Cloud Storage as a Service Basics (2 of 3)

Cloud storage is a genuinely useful service. Yes, you are storing your data in the cloud, but there are a few things worth understanding about how it works.
 
What is cloud storage...
Cloud storage involves exactly what the name suggests—storing your data with a cloud service provider rather than on a local system. As with other cloud services, you access the data stored on the cloud via an Internet link.

Even though data is stored and accessed remotely, you can maintain data both locally and on the cloud as a measure of safety and redundancy. Cloud storage has a number of advantages over traditional data storage:

The benefits...
  • If you store your data in the cloud, you can access it from any location that has Internet access, which makes it especially appealing to road warriors. 
  • Workers don’t need to use the same computer to access their data, nor do they have to carry around physical storage devices. 
  • If your organization has branch offices, they can all access the data from the cloud provider.

The basics...
There are hundreds of different cloud storage systems, and some are very specific in what they do. Some are niche-oriented and store just email or digital pictures, while others store any type of data. Some providers are small operations, while others are huge enough to fill an entire warehouse with servers.

One of Google’s datacenters in Oregon is the size of a football field and houses thousands of servers.

At the most rudimentary level, a cloud storage system needs just one data server connected to the Internet. A subscriber copies files over the Internet to the server, which records the data. When a client wants to retrieve the data, he or she accesses the data server through a web-based interface, and the server either sends the files back to the client or lets the client access and manipulate the data directly on the server.
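To make that copy-then-retrieve flow concrete, here is a minimal sketch in Python using the boto3 library for Amazon S3. The bucket name and file paths are hypothetical, and other providers (Azure Blob Storage, Google Cloud Storage) follow the same upload-then-download pattern with their own client libraries.

import boto3

# Create an S3 client; boto3 reads credentials from the
# environment or ~/.aws/credentials.
s3 = boto3.client("s3")

BUCKET = "example-backup-bucket"  # hypothetical bucket name

# Subscriber side: copy a local file to the provider's data server.
s3.upload_file("quarterly-report.pdf", BUCKET, "backups/quarterly-report.pdf")

# Client side: retrieve the stored copy later, from any machine
# with Internet access and the right credentials.
s3.download_file(BUCKET, "backups/quarterly-report.pdf", "quarterly-report-restored.pdf")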

How cloud storage works...

Cloud storage systems utilize dozens or hundreds of data servers. Because servers require maintenance or repair, it is necessary to store the saved data on multiple machines, providing redundancy. Without that redundancy, cloud storage systems couldn’t assure clients that they could access their information at any given time. Most systems store the same data on servers using different power supplies. That way, clients can still access their data even if a power supply fails.
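As a toy illustration of that redundancy, the sketch below models each "server" as a local directory (all names are made up): every write goes to all replicas, and a read falls back to the next replica if one server is unavailable. Real systems do the same thing across physically separate machines on independent power supplies.

from pathlib import Path

# Toy model: each "server" is just a directory; a real provider
# replicates across separate machines and power feeds.
REPLICAS = [Path("server_a"), Path("server_b"), Path("server_c")]

def put(key: str, data: bytes) -> None:
    """Write the same object to every replica."""
    for server in REPLICAS:
        server.mkdir(exist_ok=True)
        (server / key).write_bytes(data)

def get(key: str) -> bytes:
    """Read from the first replica that still holds the object."""
    for server in REPLICAS:
        try:
            return (server / key).read_bytes()
        except OSError:  # this "server" is down or lost the file
            continue
    raise FileNotFoundError(key)

put("notes.txt", b"keep a cloud copy")
(REPLICAS[0] / "notes.txt").unlink()   # simulate one server failing
print(get("notes.txt"))                # the data is still available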

Summary...
Many clients use cloud storage not because they’ve run out of room locally, but for safety: if something happens to their building, they haven’t lost all their data.
