Reputation: 3011
I have millions of XML files. The file names follow this pattern:
ABC_20180912_12345.xml
ABC_20180412_98765.xml
ABC_20180412_45678.xml
From these I want to copy files to a different folder based on the part of the name after the underscore. To identify the files, I have a list of the required names saved in a CSV file. An example:
vcfile="/home/mycomp/Documents/wd/vehicles.csv"
vcpvr=`cat $vcfile`
echo $vcpvr provides me with this list:
2894 4249 5464
I am able to loop through the XML files in the folder, open each file, grep to see whether it contains the string and, if it does, move the file to a new location. This is working.
The complete code:
#filesToExtract is the interim folder
fold="/home/mycomp/filesToExtract";
query=$fold/*.xml
vcfile="/home/mycomp/Documents/wd/vehicles.csv"
vcpvr=`cat $vcfile`
#xmlfiles - keep all tar.gz files here
cd ~/xmlfiles/
COUNTER=1
for f in *.tar.gz
do
echo " $COUNTER "
tar zxf "$f" -C ~/filesToExtract
for k in $query
do
file $k | if grep -q "$vcpvr"
then
mv $k ~/xmlToWork/
fi
done
#xmltowork is the final folder
#rm -r ~/filesToExtract/*.xml
COUNTER=$((COUNTER + 1))
done
But since this looks for the string inside the file instead of in the filename, it takes a long time to process millions of files. Instead, I want to look for the string in the filename and, if it is there, move the file. This is what I have tried:
target="/home/mycomp/xmlToWork"
for k in $query
do
if [[ $k =~ "$vcpvr" ]]; then
cp -v $k $target
fi
done
But this gives me an error:
tarextract.sh: 12: tarextract.sh: [[: not found
Upvotes: 1
Views: 335
Reputation: 3816
This will work just fine. I was hesitant to suggest it because it iterates over the list and runs one pass over the file names per entry, which is slow, but it is certainly faster than looking inside the files.
nn=($(cat vehicles.csv));for x in "${nn[@]}";do ls *.xml|grep "$x"|xargs -I '{}' mv {} folder/;done
The multi-line version of the same command:
nn=($(cat vehicles.csv))
for x in "${nn[@]}"
do
    ls *.xml|grep "$x"|xargs -I '{}' mv {} folder/
done
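A note on the error in the question: "[[: not found" usually means the script was run by a POSIX sh such as dash (for example via sh tarextract.sh) rather than bash. The array syntax nn=(...) used above is also bash-only, so the script has to be run with bash for it to work. As an alternative, here is a minimal sketch that stays within POSIX sh and makes a single pass over the files. It assumes the wanted value is the part of the file name after the last underscore, that it must match a list entry exactly (the grep above matches substrings instead), and that the list is separated by spaces or newlines. The function name copy_matching is made up for this example.

```shell
#!/bin/sh
# Sketch: copy XML files whose trailing ID (the part after the last
# underscore) appears in a whitespace-separated ID list.
# copy_matching is a hypothetical helper name, not from the posts above.
copy_matching() {
    src=$1      # folder holding the extracted *.xml files
    dest=$2     # folder to copy the matches into
    idfile=$3   # file with the wanted IDs, separated by spaces or newlines

    # Collapse the list into one space-padded string, e.g. " 2894 4249 5464 "
    ids=" $(tr -s '[:space:]' ' ' < "$idfile") "

    for f in "$src"/*.xml; do
        [ -e "$f" ] || continue      # the glob matched nothing
        name=${f##*/}                # strip the directory part
        id=${name##*_}               # keep what follows the last underscore
        id=${id%.xml}                # drop the extension
        case $ids in
            *" $id "*) cp "$f" "$dest"/ ;;   # exact match against the list
        esac
    done
}
```

With the paths from the question it could be called as copy_matching /home/mycomp/filesToExtract /home/mycomp/xmlToWork /home/mycomp/Documents/wd/vehicles.csv. The for loop expands the glob inside the shell, so it does not run into the argument-length limit that ls *.xml can hit with very many files. Replace cp with mv if the files should be moved rather than copied.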
Upvotes: 1