Shell Script to delete specific image-files recursively

Question

I have a third-party program that uploads files to a web server. These files are images, in different folders and with different names. The files are referenced in a database. The program imports new images and uploads them to those folders. If a file already exists, it takes the name, appends a counter, creates a new reference in the database, and removes the old reference. But instead of removing the old file as well, it keeps a copy.

Let's say we have an image file named "109101.jpg". There is a new version of the file, and it is uploaded with the filename "109101_1.jpg". This continues up to, for example, "109101_103.jpg". Now all 103 files before this one are outdated and can be deleted.

Because the program is third-party and not editable, I cannot change that behavior. Instead, I need a shell script that walks through those folders and deletes all the images before the latest one. So only "109101_103.jpg" survives and all the others before this number are deleted. In addition, there are also images with a double-underscored name (only these, no triple ones or the like). For example: "109013_35_1.jpg" is the original, the next one is "109013_35_1_1.jpg", and now it is at "109013_35_1_24.jpg". So only "109013_35_1_24.jpg" has to survive.

Right now I have no idea how to solve this problem. Any ideas?

cha0site · Accepted Answer

Here's a one-line pipeline, because I felt like it. Shown with newlines inserted because I'm not evil.

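# Collect each image's base name (trailing _N counter stripped), then, per base,
# print every numbered version except the newest. Nothing is deleted here, and
# the un-numbered original (e.g. 109101.jpg) is not matched by the regex.
# Assumes GNU find/sed/head, no whitespace in file names, and no underscores
# in directory names (the sort keys count fields separated by "_").
for F in $(find . -iname '*.jpg' -exec basename {} .jpg \; |
             sed -r -e 's/^([^_]+|[^_]+_[^_]+_[^_]+)_[0-9]+$/\1/' |
             sort -u); do
    find . -regex ".*/${F}_[0-9]+\.jpg" |
        sort -t _ -k 2 -n | sort -n -t _ -k 4 -s | head -n -1
done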
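
If you want to go on and actually delete files, and to treat the un-numbered original as outdated too, something along these lines might work. It is only a sketch under assumptions: bash, GNU find and sort, file names without tabs or newlines, and the naming scheme from the question (a base name with zero or two underscores, optionally followed by _N). list_outdated is a made-up helper name, not part of any tool. It groups files per directory, so identical names in different folders are not mixed up, and it only prints the candidates.

```shell
#!/usr/bin/env bash
# Sketch: print every outdated image below a directory, i.e. every version
# except the highest-numbered one of each image. Assumes the naming scheme
# from the question; list_outdated is a hypothetical helper name.

list_outdated() {
    local root=${1:-.} path dir name stem base counter underscores
    find "$root" -type f -iname '*.jpg' |
    while IFS= read -r path; do
        dir=${path%/*}                  # directory part
        name=${path##*/}                # file name
        stem=${name%.*}                 # file name without extension
        base=$stem
        counter=0                       # the un-numbered original counts as 0
        underscores=${stem//[^_]/}      # keep only the "_" characters
        # One or three underscores plus a numeric tail means "base_N".
        if [[ ${#underscores} == [13] && ${stem##*_} =~ ^[0-9]+$ ]]; then
            base=${stem%_*}
            counter=${stem##*_}
        fi
        # Emit: grouping key <TAB> counter <TAB> full path
        printf '%s\t%s\t%s\n' "$dir/$base" "$counter" "$path"
    done |
    LC_ALL=C sort -t $'\t' -k1,1 -k2,2n |
    # Within a group, each line that has a successor is outdated.
    awk -F '\t' 'NR > 1 && $1 == prev_key { print prev_path }
                 { prev_key = $1; prev_path = $3 }'
}
```

Run list_outdated /path/to/images first (the path is a placeholder) and check the output. Once it looks right, pipe it into xargs -r -d '\n' rm -- to delete the listed files; -r and -d are GNU xargs options.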
