Posts

Showing posts with the label interview questions and answers

Featured Post

14 Top Data Pipeline Key Terms Explained

Here are some key terms commonly used in data pipelines. 1. Data Sources. Definition: Points where data originates (e.g., databases, APIs, files, IoT devices). Examples: Relational databases (PostgreSQL, MySQL), APIs, cloud storage (S3), streaming data (Kafka), and on-premise systems. 2. Data Ingestion. Definition: The process of importing or collecting raw data from various sources into a system for processing or storage. Methods: Batch ingestion, real-time/streaming ingestion. 3. Data Transformation. Definition: Modifying, cleaning, or enriching data to make it usable for analysis or storage. Examples: Data cleaning (removing duplicates, fixing missing values); data enrichment (joining with other data sources); ETL (Extract, Transform, Load); ELT (Extract, Load, Transform). 4. Data Storage. Definition: Locations where data is stored after ingestion and transformation. Types: Data Lakes: Store raw, unstructured, or semi-structured data (e.g., S3, Azure Data Lake). Data Warehous...
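The ingestion, transformation, and storage terms above can be sketched in a few lines of Python. This is only a minimal illustration: the record layout, the `CUSTOMERS` lookup, and the `ingest`, `transform`, and `load` function names are all hypothetical and stand in for real sources and storage systems.

```python
# Hypothetical in-memory records standing in for a real data source.
RAW_ORDERS = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 1, "customer_id": 10, "amount": 25.0},  # duplicate record
    {"order_id": 2, "customer_id": 11, "amount": None},  # missing value
]
# Hypothetical second source used for enrichment.
CUSTOMERS = {10: "Asha", 11: "Ben"}


def ingest(source):
    """Batch ingestion: copy every raw record from the source."""
    return [dict(record) for record in source]


def transform(records, customers):
    """Clean (deduplicate, fill missing values) and enrich (join) the records."""
    seen_ids = set()
    cleaned = []
    for record in records:
        if record["order_id"] in seen_ids:
            continue  # drop duplicates
        seen_ids.add(record["order_id"])
        row = dict(record)
        if row["amount"] is None:
            row["amount"] = 0.0  # assumed default for missing amounts
        row["customer_name"] = customers.get(row["customer_id"], "unknown")
        cleaned.append(row)
    return cleaned


def load(records, storage):
    """Load: append the transformed records to the target storage."""
    storage.extend(records)
    return storage


# Extract, transform, then load (ETL order) into a list acting as storage.
warehouse = load(transform(ingest(RAW_ORDERS), CUSTOMERS), [])
```

Running transformation before loading makes this an ETL flow; an ELT flow would load the raw records first and transform them inside the storage system.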

Top SAP HANA Iot must read Interview Questions(3 of 3)

Below is my third set of interview questions. In this set I have given ten interview questions for your quick reference. 1) What is SAP HANA? SAP deployed SAP HANA as an integrated solution that combines software and hardware, which is frequently referred to as the SAP HANA appliance. As with SAP NetWeaver Business Warehouse Accelerator (SAP NetWeaver BW Accelerator), SAP partners with several hardware vendors to provide the infrastructure that is needed to run the SAP HANA software. Lenovo partnered with SAP to provide an integrated solution. 2) What is the memory-to-core ratio in SAP HANA? For in-memory computing appliances, such as SAP HANA, the amount of main memory is important. In-memory computing brings data that is kept on disk into main memory. This allows for much faster processing of the data because the CPU cores do not have to wait until the data is loaded from disk to memory, which means each CPU is better used. SQLDBC: An SAP native database SDK that ca...

Top Hive interview Questions for quick read (1 of 2)

Selected interview questions on Hive, a technology used in the Hadoop ecosystem. 1) What are the major activities in the Hadoop ecosystem? Within the Hadoop ecosystem, HDFS can load and store massive quantities of data in an efficient and reliable manner. It can also serve that same data back up to client applications, such as MapReduce jobs, for processing and data analysis. 2) What is the role of Hive in the Hadoop ecosystem? Hive, often considered the Hadoop data warehouse platform, got its start at Facebook as its analysts struggled to deal with the massive quantities of data produced by the social network. Requiring analysts to learn and write MapReduce jobs was neither productive nor practical. 3) What is Hive in Hadoop? Facebook developed a data warehouse-like layer of abstraction that would be based on tables. The tables function merely as metadata, and the table schema is projected onto the data, instead of actually moving potentially ma...
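The idea that a table is only metadata whose schema is projected onto the data at read time can be illustrated with a small Python sketch. This is not how Hive is implemented; the `RAW_FILE` contents, the `SCHEMA` list, and the `read_with_schema` helper are hypothetical and only show the schema-on-read concept.

```python
import csv
import io

# Hypothetical raw file contents; the data itself is never rewritten or moved.
RAW_FILE = "1,alice,34\n2,bob,29\n"

# The "table" is only metadata: column names and types.
SCHEMA = [("id", int), ("name", str), ("age", int)]


def read_with_schema(raw_text, schema):
    """Project the schema onto raw rows at read time (schema-on-read)."""
    reader = csv.reader(io.StringIO(raw_text))
    for fields in reader:
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


rows = list(read_with_schema(RAW_FILE, SCHEMA))
```

Changing `SCHEMA` changes how the same raw bytes are interpreted, without touching the stored data.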

SAP HANA In-memory Real Usage

Below is a list of questions on SAP HANA in-memory computing that explain its real usage. 1. What is in-memory computing? A1) In-memory computing is a technology that allows the processing of massive quantities of data in main memory to provide immediate results from analysis and transactions. The data that is processed is ideally real-time data (that is, data that is available for processing or analysis immediately after it is created). 2. How does in-memory computing work? A2) Keep data in main memory to speed up data access. Minimize data movement by using the columnar storage concept, compression, and performing calculations at the database level. Divide and conquer: use the multi-core architecture of modern processors and multi-processor servers (or even scale out into a distributed landscape) to grow beyond what can be supplied by a single server. 3. What is the benefit of keeping data in memory? A3) Accessing data in main memory is much faster than accessing data from ...
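The columnar storage and compression ideas in A2 can be sketched roughly as follows. This is a simplified illustration, not SAP HANA's actual storage engine; the sample `ROWS` and the `to_columns` and `dictionary_encode` helpers are assumptions made for the example.

```python
# Hypothetical row-oriented records: (country, amount).
ROWS = [("DE", 100), ("US", 250), ("DE", 75), ("US", 30)]


def to_columns(rows):
    """Columnar storage: keep each attribute in its own contiguous list."""
    countries, amounts = zip(*rows)
    return {"country": list(countries), "amount": list(amounts)}


def dictionary_encode(values):
    """Compression: replace repeated values with small integer codes."""
    dictionary = sorted(set(values))
    index = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in values]


columns = to_columns(ROWS)
dictionary, codes = dictionary_encode(columns["country"])

# An aggregate only has to scan the one column it needs.
total = sum(columns["amount"])
```

Because the aggregate reads only the `amount` column, less data has to move through memory than with a row-by-row scan.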

10 Top NoSQL Database Recently Asked Interview Questions

1) Who was involved in developing NoSQL? Amazon and Google, through their published papers. 2) What is NoSQL? NoSQL refers to non-relational databases, such as columnar databases. With NoSQL, you can store and query data outside the relational model. 3) What are the unique features of NoSQL databases? No relationships between records are needed. They store unstructured data. Individual records do not have a relationship with each other. 4) How are NoSQL databases faster than a traditional RDBMS? They store the database across multiple servers, rather than storing the whole database on a single server. By adding replicas on other servers, data can still be retrieved quickly even if one of the servers crashes. 5) What are the UNIQUE features of NoSQL? Open source and, in some products, ACID compliant. 6) What are the characteristics of a good NoSQL product? High availability: Fault tolerance when a single server goes down. Disaster recovery: For when a data center goes down, or more likely, someone digs up a network cable just outside the data center. Support: Someone to st...
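The answer to question 4 (spreading data over several servers and keeping replicas so reads survive a crash) can be sketched in Python. The `TinyCluster` class below is a hypothetical toy, not the API of any real NoSQL product, and it assumes the replica count does not exceed the server count.

```python
import hashlib


class TinyCluster:
    """Toy key-value store that shards keys and keeps replicas."""

    def __init__(self, server_count, replicas=2):
        self.servers = [dict() for _ in range(server_count)]
        self.alive = [True] * server_count
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick a first server, then use the next ones as replicas.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        first = int.from_bytes(digest[:4], "big") % len(self.servers)
        return [(first + i) % len(self.servers) for i in range(self.replicas)]

    def put(self, key, value):
        # Write the value to every live server that owns this key.
        for server in self._owners(key):
            if self.alive[server]:
                self.servers[server][key] = value

    def get(self, key):
        # Read from the first live owner that holds the key.
        for server in self._owners(key):
            if self.alive[server] and key in self.servers[server]:
                return self.servers[server][key]
        raise KeyError(key)


cluster = TinyCluster(server_count=3, replicas=2)
cluster.put("user:1", {"name": "Asha"})

# Simulate a crash of the first owner; the replica still answers the read.
primary = cluster._owners("user:1")[0]
cluster.alive[primary] = False
value = cluster.get("user:1")
```

Real systems add much more (rebalancing, consistency settings, failure detection), so this only shows why sharding and replication help availability.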

Big Data:Top Hadoop Interview Questions (2 of 5)

Frequently asked Hadoop interview questions. 1. What is Hadoop? Hadoop is a framework that gives users the power of distributed computing. 2. What is the difference between SQL and Hadoop? SQL works with structured data and is best suited to legacy technologies. Hadoop is suitable for unstructured data and is well suited to modern technologies. 3. What is the Hadoop framework? It is a distributed network of commodity servers (a cluster is made up of multiple nodes, and each node is a server). 4. What are the 4 properties of Hadoop? Accessible: Hadoop runs on large clusters of commodity machines. Robust: Hadoop assumes that low-cost commodity machines fail often, and it handles these failures gracefully. Scalable: Hadoop scales linearly to handle larger data by adding more nodes to the cluster. Simple: Hadoop allows users to quickly write efficient parallel code. 5. What kind of data does Hadoop need? Traditional RDBMS having re...
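The "Simple" property refers to the MapReduce programming model, which can be sketched in plain Python as a word count. This sketch runs sequentially on one machine; in Hadoop the map and reduce tasks are distributed across the nodes of the cluster. The function names and sample lines are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


lines = ["Hadoop stores data", "Hadoop processes data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

The user writes only the map and reduce functions; the framework is responsible for splitting the input, grouping keys, and rerunning failed tasks.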