user7394882

Reputation: 193

Merge CSV files into one file

I have a set of CSV files that I want to merge into a single CSV file. The command below takes some time to run, but afterwards I can't find the file at the destination path:

hdfs dfs -getmerge /DATA /data1/result.csv

Any help? Thanks.

Upvotes: 2

Views: 6420

Answers (2)

Bhavesh

Reputation: 919

getmerge

Usage: hadoop fs -getmerge [-nl] <src> <localdst>

Takes a source directory and a destination file as input and concatenates files in src into the destination local file. Optionally -nl can be set to enable adding a newline character (LF) at the end of each file. -skip-empty-file can be used to avoid unwanted newline characters in case of empty files.

Examples:

 hadoop fs -getmerge -nl /src /opt/output.txt

 hadoop fs -getmerge -nl /src/file1.txt /src/file2.txt /output.txt

Exit Code:

Returns 0 on success and non-zero on error.
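
Note that <localdst> is a path on the local filesystem, not on HDFS, so the merged file will not show up under hdfs dfs -ls; check the local directory instead. A minimal sketch of your command plus a local check (assuming /DATA exists on HDFS and /data1 is a writable local directory):

 hdfs dfs -getmerge -nl /DATA /data1/result.csv
 ls -lh /data1/result.csv   # verify on the local filesystem, not on HDFS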

If somehow that does not work for you, you can try the cat command like this (if your data is not too large):

 hdfs dfs -cat /DATA/* > /<local_fs_dir>/result.csv

 hdfs dfs -copyFromLocal /<local_fs_dir>/result.csv /data1/result.csv
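
To confirm the second step worked, you can then list the file on HDFS (assuming the /data1 directory already exists there):

 hdfs dfs -ls /data1/result.csv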

Upvotes: 5

Pushkin

Reputation: 534

You can also try concatenating the files on the local Linux filesystem using

cat $DOWNLOAD_DIR/*.csv >> $CONCAT_DIR/<concatenated_filename>.csv

And then put the concatenated file on HDFS.
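
For example, a minimal sketch of that last step, assuming the concatenated file was named merged.csv (a hypothetical name) and the HDFS target directory /data1 already exists:

 # upload the locally concatenated file to HDFS
 hdfs dfs -put $CONCAT_DIR/merged.csv /data1/merged.csv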

Upvotes: 0
