ddn

Reputation: 429

How to delete one set of files in a directory containing similarly named files?

A series of several hundred directories contains files in the following pattern:

Dir1:
-text_76.txt  
-text_81.txt   
-sim_76.py   
-sim_81.py

Dir2:
-text_90.txt  
-text_01.txt   
-sim_90.py   
-sim_01.py

Within each directory, the files beginning with text or sim are essentially duplicates of the other text or sim file, respectively. Each set of duplicate files has a unique numerical identifier. I only want one set per directory. So, in Dir1, I would like to delete everything in the set labeled either 81 OR 76, with no preference. Likewise, in Dir2, I would like to delete either the set labeled 90 OR 01. Each directory contains exactly two sets, and there is no way to predict the random numerical IDs used in each directory. How can I do this?

Upvotes: 0

Views: 54

Answers (2)

ebyrob

Reputation: 674

Assuming each directory always contains one file with a known name pattern, say text_xx.txt, you could run this script in each sub-directory:

ls text_*.txt | { read first; rm *"${first:4:4}"*; };

This lists all files matching the wildcard pattern text_*.txt. read takes only the first line of the ls output, so the shell variable $first ends up holding one fully expanded match: text_xx.txt. Next, ${first:4:4} takes the substring _xx. from that match, which relies on knowing the lengths of text_ and xx. Finally, rm *"${first:4:4}"* wraps that substring in wildcards, so the command that actually runs is: rm *_xx.*.

I chose to include _ and . around xx to be a bit conservative about what gets deleted.

If the length of xx is not known, things get a bit more complicated. A safer command, which only acts when xx is exactly two characters long, might be:

ls text_??.txt | { read first; rm *_"${first:5:2}".*; };

This should remove one "fileset" every time it is run in a given sub-directory. If there is only 1 fileset, it would still remove the fileset.
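
For several hundred directories, the same idea can be wrapped in a loop. The following is only a sketch under a few assumptions: remove_one_set is a hypothetical helper name, all the directories sit directly under one root directory, and every file is named prefix_ID.ext as in the question. It strips the text_ prefix and the .txt suffix instead of counting characters, so the length of the ID does not matter, and it skips any directory that has fewer than two text_*.txt files so that a second run does not delete the last remaining set.

```shell
# remove_one_set ROOT: hypothetical helper, not part of the commands above.
# The function body is a subshell ( ... ), so its variables and "set --" do not leak.
remove_one_set() (
    root=$1
    for dir in "$root"/*/; do
        # Expand the glob into the positional parameters; $1 is the first match.
        set -- "$dir"text_*.txt
        # Fewer than two matches (or an unmatched glob): nothing to thin out here.
        [ "$#" -ge 2 ] || continue
        name=${1##*/}        # strip the directory part, e.g. text_76.txt
        id=${name#text_}     # strip the prefix,         e.g. 76.txt
        id=${id%.txt}        # strip the suffix,         e.g. 76
        # Delete every file in this directory that carries the same ID.
        rm -- "$dir"*_"$id".*
    done
)

# Usage (the path is an example only):
#   remove_one_set /path/to/parent
```

Which set is removed depends on glob ordering, which matches the "no preference" requirement. Replacing rm with echo rm first gives a dry run.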

Edit: Simplified to remove unnecessary use of IFS command.

Edit: Attempt to expand on and clarify the explanation.

Upvotes: 1

Kaizhe Huang

Reputation: 1006

ls | grep -P "_81\." | xargs -d"\n" rm   # run in Dir1: removes the set labeled 81
ls | grep -P "_90\." | xargs -d"\n" rm   # run in Dir2: removes the set labeled 90

How it works:

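ls lists all files (one per line, since the output is piped).

grep -P keeps only the lines that match the given Perl-compatible regular expression.

xargs -d"\n" rm splits its input on newlines and passes the resulting file names to rm as arguments.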
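
Because rm is not reversible, it can help to look at what the first two stages of such a pipeline produce before adding the xargs stage. A minimal sketch, assuming the ID to delete is known and the files are named prefix_ID.ext; preview_set is a hypothetical helper name, not an existing command:

```shell
# preview_set ID: hypothetical helper that only lists the names a
# pipeline of this shape would hand to rm; it deletes nothing.
preview_set() {
    # Match "_<ID>." so that 76 does not also select 176 or 760.
    ls | grep -E "_$1\."
}

# Usage, run inside one directory:
#   preview_set 76
```

Once the listed names look right, the same ls and grep stages can be piped into xargs -d"\n" rm as described above.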

Upvotes: 0
