Reputation: 193
I have a set of CSV files that I want to merge into one CSV file. I run the command below; it takes some time, but afterwards I can't find the file in the destination path.
hdfs dfs -getmerge /DATA /data1/result.csv
Any help? Thanks.
Upvotes: 2
Views: 6420
Reputation: 919
getmerge
Usage: hadoop fs -getmerge [-nl] <src> <localdst>
Takes a source directory and a destination file as input and concatenates files in src into the destination local file. Optionally -nl can be set to enable adding a newline character (LF) at the end of each file, and --skip-empty-file can be used to avoid unwanted newline characters in case of empty files.
Examples:
hadoop fs -getmerge -nl /src /opt/output.txt
hadoop fs -getmerge -nl /src/file1.txt /src/file2.txt /output.txt
Exit Code:
Returns 0 on success and non-zero on error.
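Note the key detail in the description above: getmerge concatenates into a destination file on the local filesystem, not on HDFS. So if you ran the command from your question, look for the merged file with a local ls rather than hdfs dfs -ls. A minimal check, reusing the paths from your question:
hdfs dfs -getmerge /DATA /data1/result.csv
ls -lh /data1/result.csv   # local path on the machine that ran the command, not an HDFS path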
If that somehow does not work for you, you can try the cat command like this (if your data is not too large):
hdfs dfs -cat /DATA/* > /<local_fs_dir>/result.csv
hdfs dfs -copyFromLocal /<local_fs_dir>/result.csv /data1/result.csv
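If you want to skip the local intermediate file, the FileSystem shell can also stream the concatenation straight back into HDFS; a minimal sketch (put reads from stdin when - is given as the source, and /data1/result.csv must not already exist on HDFS):
hdfs dfs -cat /DATA/* | hdfs dfs -put - /data1/result.csv   # no local copy needed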
Upvotes: 5
Reputation: 534
You can also try concatenating the files in the local Linux FS using
cat $DOWNLOAD_DIR/*.csv >> $CONCAT_DIR/<concatenated_filename>.csv
And then put the concatenated file on HDFS.
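For that last step, a minimal sketch (keeping the same placeholder paths as above, and assuming /data1 as the HDFS target directory from the question):
hdfs dfs -put $CONCAT_DIR/<concatenated_filename>.csv /data1/   # upload the merged file to HDFS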
Upvotes: 0