Posts

Showing posts with the label hadoop architecture

The hadoop.apache.org web site defines Hadoop as "a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models." Quite simply, that's the philosophy: to provide a framework that's simple to use, can be scaled easily, and provides fault tolerance and high availability for production usage. The idea is to use existing low-cost hardware to build a powerful system that can process petabytes of data very efficiently and quickly. More : Top selected Hadoop Interview Questions Hadoop achieves this by storing the data locally on its DataNodes and processing it locally as well. All this is managed efficiently by the NameNode, which is the brain of the Hadoop system. All client applications read/write data through NameNode. Hadoop has two main components: the Hadoop Distributed File System (HDFS) and a framework for processing large amounts of data in parallel using the MapReduce paradigm HDFS ...

Search This Blog

ApplyBigAnalytics

Posts

Featured Post

Python: Built-in Functions vs. For & If Loops – 5 Programs Explained

Top Hadoop Architecture Interview Questions