Sunil

Reputation: 149

hadoop fs -text vs hadoop fs -cat vs hadoop fs -get

I believe all of the following commands can be used to copy HDFS files to the local file system. What are the differences, and the situational pros and cons? (Hadoop newbie here.)

hadoop fs -text /hdfs_dir/* >> /local_dir/localfile.txt
hadoop fs -cat /hdfs_dir/* >> /local_dir/localfile.txt
hadoop fs -get /hdfs_dir/* /local_dir/

My rule of thumb is to avoid 'text' and 'cat' for big files. (I use them to copy the output of my MR jobs, which is usually small in my use case.)

Upvotes: 3

Views: 26385

Answers (3)

rogue-one

Reputation: 11577

The main difference between -cat and -text is that -text detects the file's format (for example, a compression codec) and decodes it to plain text whenever possible, whereas -cat writes the raw bytes unchanged.

For example, take this LZO-compressed file.

using text:

hadoop fs -text hdfs://namenode:8020/user/hive/warehouse/database/000000_0.lzo_deflate
1,foo
2,bar
3,baz
4,hello
5,world

using cat:

hadoop fs -cat hdfs://namenode:8020/user/hive/warehouse/database/000000_0.lzo_deflate
    ίiW3�I���2�IJ,�2�U\&:�99�\�:��E9)\֙��"

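The hadoop fs -get command is used to copy files to the local file system. Unlike -cat and -text, it takes the local destination as an argument rather than writing to stdout.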
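The raw-versus-decoded distinction can be reproduced locally without a cluster. The following is only an analogy, assuming gzip as a stand-in codec: plain `cat`-style access shows compressed bytes, while decoding first (roughly what -text does for formats it recognizes) recovers the rows. The temporary directory and file name are made up for illustration.

```shell
# Local analogy only: gzip stands in for the codec, no Hadoop involved.
tmpdir=$(mktemp -d)
printf '1,foo\n2,bar\n3,baz\n' > "$tmpdir/part.csv"
gzip "$tmpdir/part.csv"                      # leaves part.csv.gz in place of part.csv

# Comparable to -cat: the raw compressed bytes, shown here as an octal dump
od -c "$tmpdir/part.csv.gz" | head -n 2

# Comparable to -text: run the data through the codec before printing
decoded=$(gzip -dc "$tmpdir/part.csv.gz")
printf '%s\n' "$decoded"
```

On a real cluster the equivalent comparison would be running hadoop fs -cat and hadoop fs -text against the same compressed path, as in the listings above.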

Upvotes: 10

USB

Reputation: 6139

-text

Usage: hadoop fs -text <src>. Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.

-cat

Usage: hadoop fs -cat URI [URI …]. Copies source paths to stdout.

-get

Usage: hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>. Copies files to the local file system. Files that fail the CRC check may be copied with the -ignorecrc option. Files and CRCs may be copied using the -crc option.

Upvotes: 0

user3484461

Reputation: 1133

hadoop fs -get 
hadoop fs -copyToLocal 

The above HDFS commands can be used to copy HDFS files to the local file system.

hadoop fs -cat 

This command will display the content of the HDFS file on your stdout (console or command prompt).

hadoop fs -text 

This will display the content of the HDFS file as text (but -text only works with zip and TextRecordInputStream formats, such as SequenceFile).
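To make the syntactic difference concrete, here is a small dry-run sketch. The function name hdfs_fetch_cmd and its mode names are hypothetical, not part of Hadoop; it only prints the command line it would use and never contacts a cluster. The point it illustrates is that -get takes the local destination as an argument, while -cat and -text write to stdout and need a shell redirect.

```shell
# Hypothetical helper: prints (does not run) the command for a given intent.
hdfs_fetch_cmd() {
  mode=$1
  src=$2
  dst=$3
  case "$mode" in
    copy)   printf 'hadoop fs -get %s %s\n' "$src" "$dst" ;;     # destination is an argument
    raw)    printf 'hadoop fs -cat %s > %s\n' "$src" "$dst" ;;   # raw bytes via stdout
    decode) printf 'hadoop fs -text %s > %s\n' "$src" "$dst" ;;  # decoded text via stdout
    *)      printf 'unknown mode: %s\n' "$mode" >&2; return 1 ;;
  esac
}

hdfs_fetch_cmd copy   /hdfs_dir/part-00000 /local_dir/
hdfs_fetch_cmd decode /hdfs_dir/part-00000 /local_dir/localfile.txt
```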

Upvotes: 1
