
Apache Storm Architecture Tutorial Flowchart

Apache Storm is popular for two main reasons. First, it can connect to many data sources. Second, it is scalable. It is also fault-tolerant, which means it guarantees that your data will be processed.


Apache Storm topologies

In Hadoop, MapReduce jobs do the data processing. In Storm, the topology is the actual data processor.
ZooKeeper handles the coordination between Nimbus and the Supervisors.

Apache Storm

  1. A Storm topology is similar to a Hadoop job. A Hadoop job runs on a defined schedule and eventually finishes.
  2. In Storm, a topology runs forever, until you stop it.
  3. A topology consists of many worker processes spread across many machines.
  4. A topology is a pre-defined design for turning your data into an end product.
  5. A topology comprises two parts: spouts and bolts.
  6. A spout is the funnel of a topology: it feeds data from a source into the topology, where the bolts process it.
Storm Topology
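The spout-and-bolt structure described above can be sketched in a few lines of plain Python. This is only an illustration of the data flow, not the Storm API: the names `sentence_spout`, `split_bolt`, `count_bolt`, and `run_topology` are hypothetical, and unlike a real topology this sketch stops when its input runs out instead of running forever.

```python
from collections import Counter
from typing import Iterable, Iterator


def sentence_spout(sentences: Iterable[str]) -> Iterator[str]:
    """Spout: funnels raw data from a source into the topology."""
    for sentence in sentences:
        yield sentence


def split_bolt(stream: Iterator[str]) -> Iterator[str]:
    """Bolt: splits each incoming sentence into lowercase words."""
    for sentence in stream:
        for word in sentence.split():
            yield word.lower()


def count_bolt(stream: Iterator[str]) -> Counter:
    """Bolt: keeps a running count of every word it receives."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
    return counts


def run_topology(sentences: Iterable[str]) -> Counter:
    """Wire the spout to the bolts: spout -> split bolt -> count bolt."""
    return count_bolt(split_bolt(sentence_spout(sentences)))


word_counts = run_topology(["Storm runs forever", "Storm processes streams"])
print(word_counts)
```

In a real cluster, each of these stages would run as many parallel tasks inside the worker processes mentioned in point 3, rather than as a single chain of generators.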

Two nodes in Storm

  1. Master Node: similar to the Hadoop JobTracker. It runs a daemon called Nimbus.
  2. Worker Node: it runs a daemon called Supervisor. The Supervisor listens for work assigned to its machine.

Master Node

  • Nimbus distributes the code around the cluster
  • It monitors for failures
  • It assigns tasks to each machine

Worker Node

  • The Supervisor listens for work assigned to its machine by Nimbus.
  • Each worker process on the node executes a subset of a topology.
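
The division of labor between the master node and the worker nodes can be sketched as a toy scheduler. This is a simplified model, not Storm's actual scheduling logic: the functions `assign_tasks` and `reassign_on_failure`, the round-robin strategy, and the task and supervisor names are all assumptions made for illustration.

```python
def assign_tasks(tasks, supervisors):
    """Nimbus-like step: spread tasks across supervisors round-robin."""
    if not supervisors:
        raise ValueError("at least one supervisor is required")
    assignment = {name: [] for name in supervisors}
    for index, task in enumerate(tasks):
        assignment[supervisors[index % len(supervisors)]].append(task)
    return assignment


def reassign_on_failure(assignment, failed):
    """Nimbus-like step: move a failed supervisor's tasks to the survivors."""
    survivors = [name for name in assignment if name != failed]
    if not survivors:
        raise ValueError("no surviving supervisor to take over the tasks")
    new_assignment = {name: list(assignment[name]) for name in survivors}
    for index, task in enumerate(assignment.get(failed, [])):
        new_assignment[survivors[index % len(survivors)]].append(task)
    return new_assignment


tasks = ["spout-1", "split-1", "split-2", "count-1"]
plan = assign_tasks(tasks, ["supervisor-a", "supervisor-b"])
print(plan)

# Simulate the failure of one worker node and hand its tasks to the other.
recovered = reassign_on_failure(plan, "supervisor-b")
print(recovered)
```

Each supervisor ends up with only a subset of the topology's tasks, which mirrors the way a worker node handles just the part of the topology assigned to it.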
