Apricot

Reputation: 3011

Bash script: move files based on matching part of filenames from a list

I have millions of xml files. The name of the xml file follows this pattern:

ABC_20180912_12345.xml
ABC_20180412_98765.xml
ABC_20180412_45678.xml

From these, I want to copy files to a different folder based on the part of the name after the last underscore. To identify the files, I have a list of the required names saved in a CSV file. An example:

vcfile="/home/mycomp/Documents/wd/vehicles.csv"
vcpvr=`cat $vcfile`

echo $vcpvr provides me with this list:

2894 4249 5464

I am able to loop through the XML files in the folder, grep each file to see whether it contains the string, and if it does, move the file to a new location. This is working.

The complete code:

#filesToExtract is the interim folder
fold="/home/mycomp/filesToExtract";
query=$fold/*.xml

vcfile="/home/mycomp/Documents/wd/vehicles.csv"
vcpvr=`cat $vcfile`

#xmlfiles - keep all tar.gz files here
cd ~/xmlfiles/
COUNTER=1
for f in *.tar.gz
do
echo " $COUNTER "
  tar zxf "$f" -C ~/filesToExtract
  for k in $query
  do
   file $k | if grep -q "$vcpvr"
   then
   mv $k  ~/xmlToWork/
   fi
   done
#xmltowork is the final folder
#rm -r ~/filesToExtract/*.xml
COUNTER=$((COUNTER + 1))
done

But since this looks for the string inside each file rather than in the filename, it takes a long time to process millions of files. Instead, I want to look for the string in the filename and, if it is there, move the file. This is what I have tried:

target="/home/mycomp/xmlToWork"

 for k in $query
  do
  if [[ $k =~ "$vcpvr" ]]; then 
    cp -v $k $target
  fi
  done

But this gives me an error: tarextract.sh: 12: tarextract.sh: [[: not found

Upvotes: 1

Views: 335

Answers (1)

Inder

Reputation: 3816

This will work, although I was hesitant to suggest it: it is a slower approach because it iterates over the list, but it is certainly faster than looking inside the files.

nn=($(cat vehicles.csv));for x in "${nn[@]}";do ls *.xml|grep "$x"|xargs -I '{}' mv {} folder/;done

A multiline version of the same:

nn=($(cat vehicles.csv))   # read the IDs into an array
for x in "${nn[@]}"
do
  # move every XML file whose name contains the current ID
  ls *.xml | grep "$x" | xargs -I '{}' mv {} folder/
done
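
As a possible alternative, here is a minimal sketch that compares only the part of each filename after the last underscore against the list, without grep or ls. It assumes the list file holds IDs separated by spaces, commas, or newlines, and that filenames follow the ABC_20180912_12345.xml pattern from the question. The function name move_matching and its three arguments are hypothetical. It sticks to POSIX sh constructs, so it should also run when the script is started with sh; the "[[: not found" message in the question usually means the script ran under a shell such as dash, where [[ is not available.

```shell
#!/bin/sh
# Sketch only: move_matching and its arguments are hypothetical names.
# Usage: move_matching LIST_FILE SRC_DIR DEST_DIR
move_matching() {
    list_file=$1
    src_dir=$2
    dest_dir=$3

    # Flatten the list into one space-delimited string, e.g. " 2894 4249 5464 ",
    # so that membership can be tested with a plain case pattern.
    wanted=" $(tr -s ', \t\r\n' ' ' < "$list_file") "

    mkdir -p "$dest_dir"

    for path in "$src_dir"/*.xml; do
        [ -e "$path" ] || continue   # glob matched nothing
        name=${path##*/}             # drop the directory part
        id=${name%.xml}              # drop the extension
        id=${id##*_}                 # keep the text after the last underscore
        [ -n "$id" ] || continue     # skip names with an empty ID
        case $wanted in
            *" $id "*) mv -- "$path" "$dest_dir"/ ;;
        esac
    done
}
```

It could be called as move_matching /home/mycomp/Documents/wd/vehicles.csv /home/mycomp/filesToExtract /home/mycomp/xmlToWork. Because the ID is compared as a whole word, a file such as ABC_28940101_11111.xml is not moved just because 2894 appears in its date part. The glob in the for loop is expanded by the shell itself and never passed to an external command, so it avoids the "Argument list too long" error that ls *.xml can hit with very many files.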

Upvotes: 1
