Reputation: 35
I want to search HDFS and list the files that contain my search string exactly. My second requirement: is there a way to search for a range of values in a file in HDFS?
Suppose the file below contains the following data:
/user/hadoop/test.txt
101,abc
102,def
103,ghi
104,aaa
105,bbb
Is there a way to search with the range [101-104] so that it returns the files that contain data in that range?
Upvotes: 0
Views: 2892
Reputation: 2681
To display the names of files that contain a pattern, loop through the files in the HDFS directory, for example:
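# List the file paths under /user/hadoop/; /^-/ skips the "Found N items" header and directories
hdfs_files=$(hdfs dfs -ls /user/hadoop/ | awk '/^-/ {print $8}')
for file in $hdfs_files; do
  # Count the distinct values 101-104 at the start of a line (anchored so that 1101 does not match)
  patterns_count=$(hdfs dfs -cat "$file" | grep -oE '^10[1-4],' | sort -u | wc -l)
  # Print the file name only if all four values of the range are present
  if [ "$patterns_count" -eq 4 ]; then
    echo "$file"
  fi
done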
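As an alternative to counting grep matches, the "does this file contain the whole range" check can be done numerically with awk, so the bounds do not have to be written as a character class. This is only a sketch: the helper name has_full_range is hypothetical, and it assumes comma-separated lines with the numeric key in the first column, as in the sample file.

```shell
# has_full_range LOW HIGH
# Reads CSV lines on stdin; succeeds (exit status 0) only if every integer
# from LOW to HIGH appears at least once in the first column.
# (Hypothetical helper, not part of Hadoop.)
has_full_range() {
  awk -F, -v lo="$1" -v hi="$2" '
    # Remember each numeric first-column value that falls inside the range
    $1 ~ /^[0-9]+$/ && $1 + 0 >= lo + 0 && $1 + 0 <= hi + 0 { seen[$1 + 0] = 1 }
    END {
      # Fail as soon as one value of the range is missing
      for (v = lo + 0; v <= hi + 0; v++)
        if (!(v in seen)) exit 1
      exit 0
    }'
}

# Possible use inside the loop over HDFS files:
#   hdfs dfs -cat "$file" | has_full_range 101 104 && echo "$file"
```

Because the comparison is numeric, the same function would work for a range such as 95 to 120, which is awkward to express as a single regular expression.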
Now the solution to the second requirement, "search for a range of values in a file in HDFS", using a shell command:
hdfs dfs -cat /user/hadoop/test.txt | grep -E "10[1-4]"
Output:
101,abc
102,def
103,ghi
104,aaa
Or print only the matched values instead of the whole lines:
hdfs dfs -cat /user/hadoop/test.txt | grep -oE "10[1-4]"
Output:
101
102
103
104
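The pattern 10[1-4] matches anywhere on a line and only covers ranges that fit a single character class. A hedged sketch of a numeric filter on the first column is shown here; the function name in_range is hypothetical, and the comma-separated layout is assumed from the sample file.

```shell
# in_range LOW HIGH
# Prints the CSV lines from stdin whose first column is an integer
# between LOW and HIGH inclusive. (Hypothetical helper.)
in_range() {
  awk -F, -v lo="$1" -v hi="$2" \
    '$1 ~ /^[0-9]+$/ && $1 + 0 >= lo + 0 && $1 + 0 <= hi + 0'
}

# Possible use with HDFS:
#   hdfs dfs -cat /user/hadoop/test.txt | in_range 101 104
```

For the sample file this would print the same four lines as the grep command, while a value such as 1101 in the first column would no longer match.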
Upvotes: 1