Reputation: 65
I have these files in Hadoop and want to list all files whose creation date is later than 2016-11-21.
-rw-r----- 3 pharpan1 hadoop 73439 2017-01-02 15:20 manpoc_pre
-rw-r----- 3 pharpan1 hadoop 12190 2017-02-02 19:42 message.txt
-rw-r----- 3 pharpan1 hadoop 374 2016-11-14 18:18 newbin
-rw-r----- 3 pharpan1 hadoop 614 2016-11-14 18:19 newcalcpi
-rw-r----- 3 pharpan1 hadoop 154 2016-11-21 20:12 newspoc
I tried the command below, but it prints all the files. How can I get only the ones that satisfy the condition?
dateA='2016-11-21'
hdfs dfs -ls -t | awk '{if($6 -ge dateA) print $8;}'
Upvotes: 4
Views: 2446
Reputation: 328
To list all the files created on a given date, filter the listing with grep:
hadoop fs -ls <path> | grep <filter_date> | sort
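A plain grep matches the date anywhere in the line, including inside a file name. As a minimal sketch, assuming the default -ls layout shown in the question (date in column 6, path in column 8), the match can be restricted to the date column. The name files_on_date is a hypothetical helper, not part of Hadoop:

```shell
# Hypothetical helper: reads `hadoop fs -ls` output on stdin and prints
# the paths (field 8) whose date column (field 6) equals the date in $1.
# Assumes paths contain no whitespace.
files_on_date() {
  awk -v d="$1" '$6 == d { print $8 }'
}

# Example usage (the path is a placeholder):
#   hadoop fs -ls /some/path | files_on_date 2016-11-14
```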
Upvotes: 0
Reputation: 146
You could try something like this:
First, determine the number of days between now and 2016-11-21:
$ (( DAYS = ($(date +"%s") - $(date +"%s" -d "2016-11-21")) / ( 24 * 3600 ) ))
$ echo $DAYS
108
Next, use that variable to find the files:
find /my/directory -ctime -${DAYS} -type f
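The day arithmetic above can be wrapped in a small function. This is only a sketch: days_between is a hypothetical helper name, and it assumes GNU date (for the -d option); -u is used so that daylight-saving changes do not shift the result.

```shell
# Hypothetical helper: whole days between two YYYY-MM-DD dates ($1 to $2).
# Assumes GNU date, which accepts -d to parse a date string.
days_between() {
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) / 86400 ))
}

# Example usage with the find command from this answer:
#   DAYS=$(days_between 2016-11-21 "$(date -u +%F)")
#   find /my/directory -ctime -"$DAYS" -type f
```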
Upvotes: 0
Reputation: 92854
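Inside the single-quoted awk program the shell variable dateA is not visible, and -ge is a shell test operator rather than an awk comparison, so the original condition never compares the dates. Pass the input date into the awk expression as a variable (via the -v option):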
dateA='2016-11-21'
hdfs dfs -ls -t | awk -v dateA="$dateA" '{if ($6 > dateA) {print $8}}'
The output:
manpoc_pre
message.txt
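The same filter can be kept as a reusable function. This is a sketch under the assumption that the listing uses the column layout from the question; newer_than is a hypothetical name. Because the comparison is strict and uses only the date column, newspoc (dated 2016-11-21) is not printed.

```shell
# Hypothetical helper: reads `hdfs dfs -ls` output on stdin and prints the
# paths (field 8) whose date (field 6) is strictly later than the cutoff $1.
# ISO dates (YYYY-MM-DD) order correctly when compared as plain strings.
# NF >= 8 skips the "Found N items" header line.
newer_than() {
  awk -v cutoff="$1" 'NF >= 8 && $6 > cutoff { print $8 }'
}

# Example usage:
#   hdfs dfs -ls -t | newer_than 2016-11-21
# Change > to >= to include files dated on the cutoff day itself.
```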
Upvotes: 3