The Most Helpful HDFS File System Commands (2 of 4)

copyFromLocal
Works similarly to the put command, except that the source is restricted to a local file reference.
hdfs dfs -copyFromLocal <localsrc> URI
hdfs dfs -copyFromLocal input/docs/data2.txt hdfs://localhost/user/rosemary/data2.txt

copyToLocal
Works similarly to the get command, except that the destination is restricted to a local file reference.
hdfs dfs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
hdfs dfs -copyToLocal data2.txt data2.copy.txt
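
Since copyFromLocal and copyToLocal are mirror images of each other, one way to check an upload is to copy the file back and compare it with the original. The sketch below assumes a POSIX shell; the function name round_trip and the HDFS_CMD variable are hypothetical helpers introduced here (HDFS_CMD only exists so the function can be exercised without a cluster).

```shell
#!/bin/sh
# HDFS_CMD is a hypothetical override; it defaults to the real hdfs client.
HDFS_CMD=${HDFS_CMD:-hdfs}

# round_trip LOCAL_FILE HDFS_URI
# Uploads LOCAL_FILE, downloads it again to a temp directory, and compares bytes.
round_trip() {
  src=$1
  uri=$2
  tmp=$(mktemp -d) || return 1
  "$HDFS_CMD" dfs -copyFromLocal "$src" "$uri" &&
    "$HDFS_CMD" dfs -copyToLocal "$uri" "$tmp/copy" &&
    cmp -s "$src" "$tmp/copy"
  status=$?          # exit status of the whole upload/download/compare chain
  rm -rf "$tmp"      # clean up the downloaded copy either way
  return "$status"
}

# Example call (needs a running cluster, so it is left commented out):
# round_trip input/docs/data2.txt hdfs://localhost/user/rosemary/data2.txt
```

The function returns 0 only when the downloaded bytes match the local file. Note that copyFromLocal normally refuses to overwrite an existing destination, so the sketch is meant for a fresh target path.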

count
Counts the number of directories, files, and bytes under the paths that match the specified file pattern.
hdfs dfs -count [-q] <paths>
hdfs dfs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
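
Without -q, each line that count prints has four columns: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME. A minimal sketch of turning that into a readable summary with awk follows; the function name summarize_count and the sample line are made up for illustration, and paths containing spaces would need extra handling.

```shell
#!/bin/sh
# Reads `hdfs dfs -count` output (no -q) on stdin.
# Columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
summarize_count() {
  awk '{ printf "%s: %s dirs, %s files, %s bytes\n", $4, $1, $2, $3 }'
}

# Made-up sample line; on a real cluster you would run:
#   hdfs dfs -count /user/hadoop | summarize_count
printf '%s\n' '           3           12            4096 /user/hadoop' | summarize_count
# prints: /user/hadoop: 3 dirs, 12 files, 4096 bytes
```

With -q, quota columns are printed before these four, so the field numbers in the awk program would have to shift accordingly.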

cp
Copies one or more files from a specified source to a specified destination. If you specify multiple sources, the specified destination must be a directory.
hdfs dfs -cp URI [URI …] <dest>
hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir
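
The rule that several sources need a directory destination can be checked up front with hdfs dfs -test -d, which exits with status 0 when the path is a directory. The wrapper below is only a sketch: safe_cp and HDFS_CMD are hypothetical names introduced here, and HDFS_CMD exists purely so the function can be tried without a cluster.

```shell
#!/bin/sh
# HDFS_CMD is a hypothetical override; it defaults to the real hdfs client.
HDFS_CMD=${HDFS_CMD:-hdfs}

# safe_cp SRC... DEST
# Refuses to copy multiple sources unless DEST is an existing directory.
safe_cp() {
  if [ "$#" -lt 2 ]; then
    echo "usage: safe_cp SRC... DEST" >&2
    return 2
  fi
  # Walk the arguments so that dest ends up holding the last one.
  for dest in "$@"; do :; done
  if [ "$#" -gt 2 ] && ! "$HDFS_CMD" dfs -test -d "$dest"; then
    echo "destination must be a directory for multiple sources: $dest" >&2
    return 1
  fi
  "$HDFS_CMD" dfs -cp "$@"
}

# Example call (needs a running cluster, so it is left commented out):
# safe_cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir
```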

du
Displays the size of the specified file, or the sizes of files and directories that are contained in the specified directory. If you specify the -s option, displays an aggregate summary of file sizes rather than individual file sizes. If you specify the -h option, formats the file sizes in a "human-readable" way.
hdfs dfs -du [-s] [-h] URI [URI …]
hdfs dfs -du /user/hadoop/dir1 /user/hadoo
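
A common use of du is finding the largest entries under a directory. The sketch below sorts on the first column, which is the size in bytes; the function name largest and the sample lines are made up for illustration. It deliberately skips -h, because human-readable sizes such as "4 K" and "1 M" do not sort numerically.

```shell
#!/bin/sh
# largest [N]: reads `hdfs dfs -du` output on stdin and keeps the N biggest
# entries (default 5), sorted by the first column (size in bytes), descending.
largest() {
  sort -k1,1nr | head -n "${1:-5}"
}

# Made-up sample lines; on a real cluster you would run:
#   hdfs dfs -du /user/hadoop | largest 10
printf '%s\n' '1024 /d/a' '4096 /d/b' '20 /d/c' | largest 2
# prints:
# 4096 /d/b
# 1024 /d/a
```

Some Hadoop versions print an extra column between the size and the path; sorting on the first column should still work in that case, but check the output of your own version.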
