Reputation: 919
(Note: I need to use distcp to get parallelism)
I have 2 files in the /user/bhavesh folder.
I have 1 file in the /user/bhavesh1 folder.
Copying the 2 files from /user/bhavesh to /user/uday works fine.
This creates the /user/uday folder.
Copying the 1 file from /user/bhavesh1 to /user/uday1 creates a file instead of a folder.
What I need: if there is one file, /user/bhavesh1/emp1.csv, the copy should create /user/uday1/emp1.csv (uday1 should be created as a directory). Any suggestion or help is highly appreciated.
Upvotes: 1
Views: 2171
Reputation: 8522
On Unix systems, a destination ending with / (such as /user/uday1/) marks it as a directory, and some tools (rsync, for example) will create it when copying a single file; plain cp and hadoop fs -cp, however, fail if the destination directory is missing.
With hadoop distcp, a trailing / on the destination is ignored when the source is a single file, so a missing destination ends up as a file rather than a directory. One workaround is to create the destination directory before running distcp. Adding the -p option to -mkdir avoids the "directory already exists" error.
hadoop fs -mkdir -p /user/uday1 ; hadoop distcp /user/bhavesh1/emp*.csv /user/uday1/
This works for both a single file and multiple files in the source directory.
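If this is needed in more than one place, the two commands can be wrapped in a small shell function. This is only a sketch: the function name copy_into_dir is made up for illustration, and it assumes the hadoop command is on the PATH and that the user can write to the destination.

```shell
# copy_into_dir SRC DEST_DIR   (hypothetical helper name, not part of Hadoop)
# Creates DEST_DIR first so that distcp treats it as a directory even when
# SRC matches only a single file.
copy_into_dir() {
    # -p creates missing parents and does not fail if the directory exists
    hadoop fs -mkdir -p "$2" || return 1
    # Quote SRC so any glob is passed through to Hadoop instead of being
    # expanded by the local shell; ${2%/} avoids a doubled trailing slash
    hadoop distcp "$1" "${2%/}/"
}
```

It would be called as copy_into_dir '/user/bhavesh1/emp*.csv' /user/uday1, with the source pattern in single quotes.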
Upvotes: 1