Reputation: 429
A series of several hundred directories contains files in the following pattern:
Dir1:
-text_76.txt
-text_81.txt
-sim_76.py
-sim_81.py
Dir2:
-text_90.txt
-text_01.txt
-sim_90.py
-sim_01.py
Within each directory, the files beginning with text or sim are essentially duplicates of the other text or sim file, respectively. Each set of duplicate files has a unique numerical identifier. I only want one set per directory. So, in Dir1, I would like to delete everything in the set labeled either 81 OR 76, with no preference. Likewise, in Dir2, I would like to delete either the set labeled 90 OR 01. Each directory contains exactly two sets, and there is no way to predict the random numerical IDs used in each directory. How can I do this?
Upvotes: 0
Views: 54
Reputation: 674
Assuming each directory always contains at least one file with a known pattern, say text_xx.txt, you could run this command in each sub-directory:
ls text_*.txt | { read first; rm *"${first:4:4}"*; };
This lists all files matching the wildcard pattern text_*.txt. read takes only the first line of the ls output, so the shell variable $first holds one fully expanded match: text_xx.txt. Next, the bash substring expansion ${first:4:4} takes four characters starting at offset 4, which yields _xx. (underscore, ID, dot), given the known lengths of text_ and xx. Finally, rm *"${first:4:4}"* wraps that substring in wildcards, so the command rm *_xx.* is what actually runs.
I chose to include _ and . around xx to be a bit conservative about what gets deleted.
If the length of xx is not known, things get a bit more complicated. A more conservative command, which matches only two-character IDs and keeps the _ and . delimiters outside the variable, might be:
ls text_??.txt | { read first; rm *_"${first:5:2}".*; };
This should remove one "fileset" every time it is run in a given sub-directory. Note that if only one fileset is left, it will remove that one as well, so run it only once per directory.
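For IDs of unknown length, the same idea can be sketched with a glob and prefix/suffix stripping instead of fixed offsets. This is only a sketch under assumptions: the function name remove_one_set is made up for illustration, every set is assumed to contain a text_ID.txt file, and the guard that leaves a directory alone when it holds a single set is an addition, not part of the commands above.

```shell
# Hypothetical helper: delete one fileset (text_ID.txt, sim_ID.py, ...) from a directory.
remove_one_set() {
    dir=$1
    # Expand the glob into this function's positional parameters
    # instead of parsing the output of ls.
    set -- "$dir"/text_*.txt
    # An unmatched glob stays literal, so make sure the first entry exists.
    [ -e "$1" ] || return 0
    # Assumption: do nothing unless more than one set is present.
    [ "$#" -gt 1 ] || return 0
    first=${1##*/}       # strip the directory part, e.g. text_76.txt
    id=${first#text_}    # drop the prefix, e.g. 76.txt
    id=${id%.txt}        # drop the extension, e.g. 76
    # Keep _ and . around the ID so that only this set matches.
    rm -- "$dir"/*_"$id".*
}
```

It could then be applied to every directory with a loop such as: for d in */; do remove_one_set "$d"; done. Because the glob is sorted, the set whose text file sorts first is the one removed.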
Edit: Simplified to remove unnecessary use of IFS command.
Edit: Attempt to expand on and clarify the explanation.
Upvotes: 1
Reputation: 1006
ls | grep -P "_76\." | xargs -d"\n" rm   # in Dir1: removes the set labeled 76
ls | grep -P "_90\." | xargs -d"\n" rm   # in Dir2: removes the set labeled 90
How it works:
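ls lists all files (one per line, since the output is piped).
grep -P keeps only the lines that match the Perl-compatible regular expression.
xargs -d"\n" rm splits its input on newlines (-d is a GNU xargs option) and passes the resulting file names to rm as arguments.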
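Since a pipeline ending in rm deletes immediately, it may help to preview the matches first. The following is a minimal sketch, assuming the ID to remove is already known; the function name list_set is hypothetical and it only prints, it never deletes.

```shell
# Hypothetical helper: print the files of one set without deleting anything.
list_set() {
    dir=$1
    id=$2
    for f in "$dir"/*_"$id".*; do
        # Skip the literal pattern that is left over when nothing matches.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

For example, list_set Dir1 76 would print the paths of the text and sim files labeled 76, which can be checked by eye before running the actual delete.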
Upvotes: 0