Rangooski

Reputation: 873

Remove file if a file with the same name but different extension doesn't exist in another directory

I have 3 directories. I want to delete the files in raw and xml which are not in clean_raw.

sample
.
├── clean_raw
│   ├── 1.jpg
│   ├── 2.jpg
│   └── 5.jpg
├── raw
│   ├── 1.jpg
│   ├── 2.jpg
│   ├── 3.jpg
│   ├── 4.jpg
│   └── 5.jpg
└── xml
    ├── 1.xml
    ├── 2.xml
    ├── 3.xml
    ├── 4.xml
    └── 5.xml

I used rsync to delete from the raw directory the files that are missing from clean_raw, using the line below.

rsync -r --delete --existing --ignore-existing sample/clean_raw/ sample/raw/

This deletes the images 3.jpg and 4.jpg from the raw directory.

How can I delete the corresponding XML files, 3.xml and 4.xml, using a shell script?

Upvotes: 2

Views: 1149

Answers (3)

F. Hauri  - Give Up GitHub

Reputation: 70762

Re-use rsync output:

Doing so ensures that rm runs alongside the rsync process and avoids browsing the directories again to search for missing files.

Using pure bash:

while read -r action file;do
    [ "$action" = "deleting" ] &&
        [ -f "sample/xml/${file%.*}.xml" ] &&
        rm -v "sample/xml/${file%.*}.xml"
done < <(
    rsync -vr --delete --existing --ignore-existing \
        sample/clean_raw/ sample/raw/
)

may produce:

removed 'sample/xml/4.xml'
removed 'sample/xml/3.xml'

You can then remove the -v option from rm to suppress this output.
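
The loop relies on the ${file%.*} expansion, which strips the shortest trailing ".suffix" from the name reported by rsync before .xml is appended. A minimal sketch of that mapping, assuming hypothetical file names and a helper called to_xml that is not part of the answer:

```shell
# to_xml is a hypothetical helper used only for illustration.
# It maps a file name, as reported by rsync, to the matching path under sample/xml.
to_xml() {
    # ${1%.*} removes the shortest suffix matching ".*", i.e. the last extension only.
    printf 'sample/xml/%s.xml\n' "${1%.*}"
}

to_xml 3.jpg            # prints sample/xml/3.xml
to_xml archive.tar.gz   # prints sample/xml/archive.tar.xml
```

Only the last extension is dropped, so names with several dots keep everything before the final one.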

Further sample

Build an array containing the list of .xml files to delete. This allows rm to be run only once, deleting all files in a single call.

This sample lists the files and prompts the user before deletion.

rsyncCmd=(rsync -vr --delete --existing --ignore-existing)
source=sample/clean_raw
target=sample/raw
xmlFiles=()
while read -r action file;do
    [ "$action" = "deleting" ] &&
        [ -f "sample/xml/${file%.*}.xml" ] &&
        xmlFiles+=("sample/xml/${file%.*}.xml")
done < <(
    "${rsyncCmd[@]}" "$source/" "$target/"
)
if [ -f "${xmlFiles[0]}" ] ;then
    ls -l "${xmlFiles[@]}"
    read -sn 1 -p "Delete ${#xmlFiles[@]} files (Y/n)? " answer
    case "$answer" in
        n|N )   echo No  ;;
        *   )   echo Yes;
                rm -v "${xmlFiles[@]}";;
    esac
fi

By using sed

Add the -v (verbose) switch to the rsync command, then catch lines of the form deleting file.jpg with sed to build, on the fly, a shell script that removes the .xml files.

rsync -vr --delete --existing --ignore-existing sample/clean_raw/ sample/raw/ |
    sed -ne 's/^deleting \(.*\)\.jpg$/rm -v \o47sample\/xml\/\1.xml\o47/p' |
    sh
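
Before piping anything into sh, it can help to look at the commands sed generates. The sketch below is only an illustration: the rsync output is simulated with a variable rather than produced by a real run, and the sed script is double-quoted so that plain single quotes can be used in place of the GNU-specific \o47 escape.

```shell
# Simulated "rsync -v" output; these lines are an assumption for illustration.
rsync_output='sending incremental file list
deleting 4.jpg
deleting 3.jpg'

# Turn each "deleting NAME.jpg" line into an rm command, but do not execute it.
generated=$(printf '%s\n' "$rsync_output" |
    sed -ne "s/^deleting \(.*\)\.jpg\$/rm -v 'sample\/xml\/\1.xml'/p")

printf '%s\n' "$generated"
# rm -v 'sample/xml/4.xml'
# rm -v 'sample/xml/3.xml'
```

Note that a file name containing a single quote would break the generated script, so this approach is best kept for directories with well-behaved names.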

By using sed, but keeping rsync's logfile.

If for any reason (debugging, for instance) you want to keep rsync's log while still running sed '...rm ..'|sh, you can use tee:

rsync -vr --delete --existing --ignore-existing sample/clean_raw/ sample/raw/ |
    tee /path/to/rsynclog.txt |
    sed -ne 's/^deleting \(.*\)\.jpg$/rm -v \o47sample\/xml\/\1.xml\o47/p' |
    sh

You can add the -a option to the tee command to append to an existing log file instead of overwriting it.

Upvotes: 2

Raman Sailopal

Reputation: 12877

while IFS= read -r fil
do
   # Look for the .jpg counterpart of each .xml file name
   if test ! -f "<path to clean_raw>/${fil%.xml}.jpg"
   then
       echo rm -f "<path to xml>/$fil"
       #rm -f "<path to xml>/$fil"
   fi
done < <(find "<path to xml>" -name "*.xml" -printf "%f\n")

Use find to list the files in the xml folder, printing just the file names with -printf. Feed this list into a while loop that takes each name and checks whether its counterpart exists in clean_raw. If it doesn't, the file is deleted.

Repeat the step for the raw folder

echo is used to check that the file list is as expected. Comment out the echo command and uncomment the rm command when you are happy with the result.
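
Since the same check applies to both the raw and the xml folder, it can be wrapped in a function. This is a sketch under assumptions rather than the answer's own method: list_orphans is a hypothetical helper, it uses a plain glob instead of find, and it runs against a throwaway copy of the question's layout created with mktemp so that nothing real is touched.

```shell
# list_orphans is a hypothetical helper: print every file in directory $2
# whose base name (last extension stripped) has no .jpg counterpart in directory $1.
list_orphans() {
    for f in "$2"/*; do
        [ -f "$f" ] || continue      # skips the literal pattern if the glob matched nothing
        name=${f##*/}                # strip the directory part
        [ -f "$1/${name%.*}.jpg" ] || printf '%s\n' "$f"
    done
}

# Throwaway copy of the layout from the question.
tmp=$(mktemp -d)
mkdir "$tmp/clean_raw" "$tmp/raw" "$tmp/xml"
for n in 1 2 5; do : > "$tmp/clean_raw/$n.jpg"; done
for n in 1 2 3 4 5; do : > "$tmp/raw/$n.jpg"; : > "$tmp/xml/$n.xml"; done

# Only list the candidates; nothing is deleted here.
raw_orphans=$(list_orphans "$tmp/clean_raw" "$tmp/raw")
xml_orphans=$(list_orphans "$tmp/clean_raw" "$tmp/xml")
printf '%s\n' "$raw_orphans" "$xml_orphans"

rm -rf "$tmp"
```

On this layout the function lists 3.jpg and 4.jpg under raw, and 3.xml and 4.xml under xml. Once the list looks right, it could be fed to rm.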

Upvotes: 1

oguz ismail

Reputation: 50750

You're looking for something like this:

for xml in sample/xml/*; do
  jpg=${xml%.*}.jpg
  jpg=${xml%/*/*}/clean_raw/${jpg##*/}
  if ! test -f "$jpg"; then
    echo rm "$xml"
  fi
done

Remove echo if the output looks good.

Upvotes: 3
