AWS Block Vs Object Storage Top Differences

In AWS, block and object storage are two different storage types. Storage is a core concept in any cloud environment, so it helps to understand how the two differ.

Object Vs Block Storage

Why do they have different names? Because they store data in different ways: object storage keeps a file whole, while block storage splits it into blocks.


Object Storage


  • An object is a single unit. The file is not divided into pieces.
  • In AWS, object storage stores your file as-is, whatever its size.
  • For example, a 10 MB file is saved as one 10 MB object.
  • What happens when you update a 30 MB file? The old object is replaced by a brand-new 30 MB object.
  • Even a small change means writing the whole file again, so frequent updates use a lot of resources.
  • Object storage works best for large files that rarely change.
  • AWS manages object storage for you and has full control over how the objects are stored.
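
The whole-object update behaviour can be illustrated with a small in-memory model. This is only a sketch: `ToyObjectStore`, its `put` and `get` methods, and the key `report.bin` are hypothetical names made up for illustration and are not part of any AWS API.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store (hypothetical, not an AWS API)."""

    def __init__(self):
        self._objects = {}      # key -> bytes; every value is stored whole
        self.bytes_written = 0  # running total of bytes sent to the store

    def put(self, key, data):
        # A put always replaces the entire object, even for a tiny change.
        self._objects[key] = bytes(data)
        self.bytes_written += len(data)

    def get(self, key):
        return self._objects[key]


store = ToyObjectStore()
original = b"a" * (10 * 1024 * 1024)  # a 10 MiB file
store.put("report.bin", original)

# Change a single byte: the whole object has to be written again.
updated = b"b" + original[1:]
store.put("report.bin", updated)

# Bytes written just for the one-byte change.
rewritten = store.bytes_written - len(original)
print(rewritten)
```

In this toy model a one-byte edit costs a full rewrite of the file, which is why object storage suits large files that change rarely.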


Block Storage


  • Block storage divides your file into fixed-size blocks.
  • Suppose the block size is 512 bytes. A 10 MB file is then split into about 20,000 blocks (20,480 if 10 MB is taken as 10 × 1024 × 1024 bytes).
  • When you change a single character, only the block that contains it is rewritten. The other blocks are not touched.
  • This saves network bandwidth compared with rewriting the whole file.
  • When the data changes often, block storage is the better choice.
  • Block storage volumes are mountable, much like a disk.
  • AWS has no visibility into the individual blocks inside a volume. It sees only the volumes themselves.
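
The block arithmetic and the single-block update can be sketched in a few lines of Python. This is a simplified model, assuming the 512-byte block size from the example and a file of 10 × 1024 × 1024 bytes. The helper names `block_count` and `changed_blocks` are hypothetical and do not correspond to any AWS API.

```python
BLOCK_SIZE = 512  # bytes, the block size assumed in the example


def block_count(file_size, block_size=BLOCK_SIZE):
    """Number of blocks needed to hold file_size bytes (rounded up)."""
    return (file_size + block_size - 1) // block_size


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Indices of the blocks whose contents differ between old and new."""
    total = block_count(max(len(old), len(new)), block_size)
    return [
        i
        for i in range(total)
        if old[i * block_size:(i + 1) * block_size]
        != new[i * block_size:(i + 1) * block_size]
    ]


original = bytes(10 * 1024 * 1024)  # a 10 MiB file filled with zero bytes
blocks = block_count(len(original))

# Change one byte at offset 1000, which falls inside block index 1.
edited = bytearray(original)
edited[1000] = 0x41
dirty = changed_blocks(original, bytes(edited))

# Only the dirty blocks need to be rewritten.
bytes_to_rewrite = len(dirty) * BLOCK_SIZE
print(blocks, dirty, bytes_to_rewrite)
```

Under these assumptions the file occupies 20,480 blocks, and the one-byte edit touches a single 512-byte block instead of the whole file.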
