
The 2 Top Skills You Need to Become an Analyst

Pools of master data held in repositories play a big role in data analytics. Much of this data, such as product data and customer data, is already stored in data warehouses.

Big data work needs a mix of skills: technical skills and some soft skills.

1# Technical Skills

One is the ability to work with and administer software such as:
  • Frameworks like Hadoop
  • NoSQL databases like Cassandra or HBase
  • Analytics programming languages and tools like R or Pig

2# Soft Skills

The ability to think broadly across the organization, to understand the bottom-line needs of the business, to know which analytics questions to pose to get to those bottom lines, and to measure and communicate results.

Additional technical skills: SAS, Cognos.
