R Language basics for Beginners to Apply in Analytics

In the early days, a key feature of R was that its syntax was very similar to that of S, making it easy for S-PLUS users to switch over. While R's syntax is nearly identical to S's, its semantics, though superficially similar, are quite different.


In fact, R is technically much closer to the Scheme language than it is to the original S language when it comes to how R works under the hood. Today R runs on almost any standard computing platform and operating system. Its open-source nature means that anyone is free to adapt the software to whatever platform they choose.


Indeed, R has been reported to be running on modern tablets, phones, PDAs, and game consoles. One nice feature that R shares with many popular open-source projects is frequent releases. These days there is a major annual release, typically in April, where major new features are incorporated and released to the public. Throughout the year, smaller-scale bug-fix releases are made as needed.


The frequent releases and regular release cycle indicate active development of the software and ensure that bugs are addressed in a timely manner.

Of course, while the core developers control the primary source tree for R, many people around the world make contributions in the form of new features, bug fixes, or both. Another key advantage that R has over many other statistical packages (even today) is its sophisticated graphics capabilities. 


R’s ability to create “publication quality” graphics has existed since the very beginning and has generally been better than competing packages.

Today, with many more visualization packages available than before, that trend continues. R’s base graphics system allows for very fine control over essentially every aspect of a plot or graph.
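As a rough illustration of that fine control, here is a minimal sketch using only base R and the built-in `cars` dataset. The colors, labels, and the choice of a linear fit are assumptions made for this example, not anything prescribed by R itself.

```r
# Fit a simple linear model on the built-in `cars` dataset
# (speed in mph, stopping distance in ft).
fit <- lm(dist ~ speed, data = cars)

# Base graphics: each argument controls one visual aspect of the plot.
plot(cars$speed, cars$dist,
     pch  = 19,                       # solid circles as point symbols
     col  = "steelblue",              # point color
     cex  = 1.2,                      # point size multiplier
     las  = 1,                        # horizontal axis tick labels
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)",
     main = "Stopping distance vs. speed")

# Overlay the fitted line with its own line type, width, and color.
abline(fit, lty = 2, lwd = 2, col = "firebrick")

# Add a legend without a surrounding box.
legend("topleft", legend = "Linear fit",
       lty = 2, lwd = 2, col = "firebrick", bty = "n")
```

Each call adds to the same plot, so a figure can be built up one layer at a time.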


Other newer graphics systems, like lattice and ggplot2, allow for complex and sophisticated visualizations of high-dimensional data. R has maintained the original S philosophy, which is to provide a language that is both useful for interactive work and powerful enough for developing new tools.

This allows the user, who takes existing tools and applies them to data, to slowly but surely become a developer who is creating new tools.
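The step from user to developer can be sketched in a few lines of base R. The data values and the function name `summarize_vec` below are hypothetical, chosen only for this illustration.

```r
# Interactive use: apply existing tools directly to some data.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
mean(x)
sd(x)

# Developer step: wrap the same calls in a reusable function.
# `summarize_vec` is a hypothetical name used for this sketch.
summarize_vec <- function(v) {
  stopifnot(is.numeric(v))            # guard against non-numeric input
  c(mean = mean(v), sd = sd(v), n = length(v))
}

result <- summarize_vec(x)
print(result)
```

The function does nothing the interactive calls did not already do, but once it exists it can be reused on any numeric vector or shared with others.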


Finally, one of the joys of using R has nothing to do with the language itself, but rather with the active and vibrant user community. 


In many ways, a language is successful inasmuch as it creates a platform with which many people can create new things. R is that platform and thousands of people around the world have come together to make contributions to R, to develop packages, and help each other use R for all kinds of applications.


The R-help and R-devel mailing lists have been highly active for over a decade now and there is considerable activity on websites like Stack Overflow.
