Reputation: 200
Is there a way to get the first two files from HDFS using the command line? My Hadoop version is 2.7.3.
I have a folder in HDFS with multiple files that another application is putting there:
/user/Lab01/inpu/ingestionFile1.json
/user/Lab01/inpu/ingestionFile2.json
/user/Lab01/inpu/ingestionFile3.json
/user/Lab01/inpu/ingestionFile4.json
I need to work with just the first two files based on time, so if I list the contents using:
$ hdfs dfs -ls -R /user/Lab01/input
-rw------- 3 huser dev 668 2019-02-13 11:34 /user/Lab01/inpu/ingestionFile1.json
-rw------- 3 huser dev 668 2019-02-13 11:36 /user/Lab01/inpu/ingestionFile2.json
-rw------- 3 huser dev 668 2019-02-13 11:38 /user/Lab01/inpu/ingestionFile3.json
-rw------- 3 huser dev 668 2019-02-13 11:41 /user/Lab01/inpu/ingestionFile4.json
To get the first two files from the directory, I simply pipe the command through head -2:
$ hdfs dfs -ls -R /user/Lab01/input | head -2
-rw------- 3 huser dev 668 2019-02-13 11:34 /user/Lab01/inpu/ingestionFile1.json
-rw------- 3 huser dev 668 2019-02-13 11:36 /user/Lab01/inpu/ingestionFile2.json
The normal way to get files from HDFS is with -get:
hdfs dfs -get /user/Lab01/input/fileName
So that's why I'm now trying to combine these two commands:
$ hdfs dfs -get /user/Lab01/input | hdfs dfs -ls -R /user/Lab01/input | head -2
But I don't get the desired result. I just get the output of the last command (hdfs dfs -ls -R /user/Lab01/input | head -2):
-rw------- 3 huser dev 668 2019-02-13 11:34 /user/Lab01/inpu/ingestionFile1.json
-rw------- 3 huser dev 668 2019-02-13 11:36 /user/Lab01/inpu/ingestionFile2.json
Upvotes: 0
Views: 1531
Reputation: 191874
You can't pipe a -get to an -ls: -ls reads nothing from standard input, so the pipe passes nothing useful along.
You need to first run -ls | head -2, then use awk to cut out the filenames that are listed, and then -get those two individually.
Something like this should print only the paths:
hdfs dfs -ls -R /user/Lab01/input | head -2 | awk '{print $8}'
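The pipeline above relies on the listing already being in time order, which as far as I know is not guaranteed. A minimal sketch that sorts on the date and time columns explicitly is below. The function name oldest_paths is hypothetical, and it assumes the 8-column -ls layout shown in the question and paths without whitespace.

```shell
# oldest_paths N: read `hdfs dfs -ls` output on stdin and print the paths
# of the N files with the oldest modification time.
# Assumptions: 8-column listing (date in column 6, time in column 7,
# path in column 8) and no whitespace inside paths.
oldest_paths() {
  n="$1"
  # Keep regular files only (lines starting with "-"); this skips
  # directories and any "Found N items" header line.
  grep '^-' |
    # YYYY-MM-DD and HH:MM sort chronologically as plain text.
    LC_ALL=C sort -b -k6,6 -k7,7 |
    head -n "$n" |
    awk '{print $8}'
}

# Example (needs a working HDFS client, so it is left commented out):
# hdfs dfs -ls -R /user/Lab01/input | oldest_paths 2
```

The output is one path per line, so it can be fed to a download step in the same way as the awk command above.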
See also: How to list only the file names in HDFS
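Then just append "| xargs -n 1 hdfs dfs -get" to download the files. The -n 1 makes xargs run one -get per file; when -get is given several paths at once, it treats the last one as the local destination.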
Upvotes: 2