Reputation: 149
I believe all of the following commands can be used to copy HDFS files to the local file system. What are the differences and situational pros/cons? (Hadoop newbie here.)
hadoop fs -text /hdfs_dir/* >> /local_dir/localfile.txt
hadoop fs -cat /hdfs_dir/* >> /local_dir/localfile.txt
hadoop fs -get /hdfs_dir/* /local_dir/
My rule of thumb is to avoid using 'text' and 'cat' for big files. (I use them to copy the output of my MR jobs, which is usually small in my use case.)
Upvotes: 3
Views: 26385
Reputation: 11577
The main difference between -cat and -text is that -text detects the encoding of the file and decodes it to plain text whenever possible, whereas -cat does not.
For example, take this LZO-compressed file.
Using -text:
hadoop fs -text hdfs://namenode:8020/user/hive/warehouse/database/000000_0.lzo_deflate
1,foo
2,bar
3,baz
4,hello
5,world
Using -cat (the output is the raw compressed bytes):
hadoop fs -cat hdfs://namenode:8020/user/hive/warehouse/database/000000_0.lzo_deflate
ίiW3�I���2�IJ,�2�U\&:�99�\�:��E9)\֙��"
The -get command is used to copy files to the local filesystem.
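A minimal sketch of the -get form (the paths are hypothetical, and the commands require a running HDFS):

```shell
# Copy every file under an HDFS directory into a local directory.
# Note: -get takes a local destination path as an argument;
# shell redirection (>>) is not used here, unlike with -cat/-text.
hadoop fs -get /hdfs_dir/* /local_dir/

# -copyToLocal does the same thing for this purpose:
hadoop fs -copyToLocal /hdfs_dir/part-r-00000 /local_dir/
```

Because -get writes files directly (rather than streaming bytes through stdout), it preserves the individual files instead of concatenating them into one.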
Upvotes: 10
Reputation: 6139
-text
Usage: hadoop fs -text &lt;src&gt;. Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.
-cat
Usage: hadoop fs -cat URI [URI …] Copies source paths to stdout.
-get
Usage: hadoop fs -get [-ignorecrc] [-crc] &lt;src&gt; &lt;localdst&gt;. Copies files to the local file system. Files that fail the CRC check may be copied with the -ignorecrc option. Files and CRCs may be copied using the -crc option.
Upvotes: 0
Reputation: 1133
hadoop fs -get
hadoop fs -copyToLocal
The HDFS commands above can be used to copy HDFS files to the local file system.
hadoop fs -cat
This command displays the content of the HDFS file on your stdout (console or command prompt).
hadoop fs -text
This also displays the content of the HDFS file, but -text additionally handles zip and TextRecordInputStream formats such as SequenceFile.
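The difference shows up clearly on a SequenceFile. A hedged sketch (the path is hypothetical, and the commands need a live HDFS):

```shell
# -text recognizes the SequenceFile container and prints decoded key/value records.
hadoop fs -text /user/hive/warehouse/mytable/part-00000

# -cat dumps the raw bytes of the same file, so you get the binary
# container format (starting with a "SEQ" magic header) instead of text.
hadoop fs -cat /user/hive/warehouse/mytable/part-00000
```

For plain uncompressed text files the two commands produce identical output, which is why the difference is easy to miss until you hit a compressed or container format.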
Upvotes: 1