Reputation: 4343
I have a text file that contains a list of file names.
Example:
10.jpg
11.jpg
12.jpeg
...
In a folder, these files should be protected from deletion and all other files should be deleted.
So I want the opposite logic of this question: Shell command/script to delete files whose names are in a text file
How can I do that?
Upvotes: 2
Views: 388
Reputation: 8406
Provided there are no spaces or special escaped characters in the file names, either of these (or variations of them) would work:
rm -v $(stat -c %n * | sort - excluded_file_list | uniq -u)
stat -c %n * | grep -vxFf excluded_file_list | xargs rm -v
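If the file names may contain spaces, a line-oriented variant of the second command can be sketched as follows. It assumes GNU grep and GNU xargs (for -d and -r) and names without embedded newlines. The scratch directory, the sample names, and the keep list stored outside the directory are made up for the demo.

```shell
#!/bin/bash
# Demo in a scratch directory so that no real files are touched.
dir=$(mktemp -d)
list=$(mktemp)                     # hypothetical keep list, stored outside $dir
printf '%s\n' 10.jpg 11.jpg 12.jpeg > "$list"
cd "$dir" || exit 1
touch 10.jpg 11.jpg 12.jpeg 13.jpg "my photo.jpg"

# grep: -F fixed strings, -x whole-line match, -v invert, -f read names from a file.
# xargs: -d '\n' splits on newlines only; -r skips rm when nothing is left to delete.
printf '%s\n' * | grep -vxFf "$list" | xargs -r -d '\n' rm -v --
```

Matching whole lines as fixed strings means that 1.jpg in the list does not accidentally protect 11.jpg, and a dot is not treated as a regular-expression wildcard.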
Upvotes: 1
Reputation: 2308
Use extglob and Bash extended pattern matching !(pattern-list):
!(pattern-list)
Matches anything except one of the given patterns, where a pattern-list is a list of one or more patterns separated by a |.
extglob
If set, the extended pattern matching features described above are enabled.
So for example:
$ ls
10.jpg 11.jpg 12.jpeg 13.jpg 14.jpg 15.jpg 16.jpg a.txt
$ shopt -s extglob
$ shopt | grep extglob
extglob on
$ cat a.txt
10.jpg
11.jpg
12.jpeg
$ tr '\n' '|' < a.txt
10.jpg|11.jpg|12.jpeg|
$ ls !(`tr '\n' '|' < a.txt`)
13.jpg 14.jpg 15.jpg 16.jpg a.txt
According to the example, the files that would be deleted are 13.jpg 14.jpg 15.jpg 16.jpg a.txt.
So with extglob and !(pattern-list), we can obtain the files that are not listed in the file.
Additionally, if you want the pattern to also match entries whose names start with . (which globs skip by default), you can switch on the dotglob option with shopt -s dotglob.
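The same idea can be wrapped into a short script. This is only a sketch under a few assumptions: the list is called keep.txt and names itself so that it survives, the scratch directory exists only for the demo, and paste replaces tr to avoid the trailing |. Entries containing glob characters such as * or [ would be treated as patterns rather than literal names.

```shell
#!/bin/bash
shopt -s extglob nullglob          # must be enabled before the !( ) line is parsed

dir=$(mktemp -d)                   # scratch directory for the demo
cd "$dir" || exit 1
touch 10.jpg 11.jpg 12.jpeg 13.jpg 14.jpg
printf '%s\n' 10.jpg 11.jpg 12.jpeg keep.txt > keep.txt   # the list protects itself

pattern=$(paste -sd '|' keep.txt)  # joins the lines with | and no trailing separator
doomed=( !($pattern) )             # every entry that matches none of the listed names
printf 'would delete: %s\n' "${doomed[@]}"
# rm -v -- "${doomed[@]}"          # uncomment to actually delete
```

Collecting the matches in an array first makes it possible to review them before anything is removed.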
Upvotes: 3
Reputation: 6335
This is one way that will work with bash GLOBIGNORE:
$ cat file2
10.jpg
11.jpg
12.jpg
$ ls *.jpg
10.jpg 11.jpg 12.jpg 13.jpg
$ echo $GLOBIGNORE
$ GLOBIGNORE=$(tr '\n' ':' <file2 )
$ echo $GLOBIGNORE
10.jpg:11.jpg:12.jpg:
$ ls *.jpg
13.jpg
As shown, globbing ignores whatever (file name, pattern, etc.) is included in the GLOBIGNORE Bash variable.
This is why the last ls reports only 13.jpg: the files 10.jpg, 11.jpg and 12.jpg are ignored.
As a result, rm *.jpg will remove only 13.jpg on my system:
$ rm -iv *.jpg
rm: remove regular empty file '13.jpg'? y
removed '13.jpg'
When you are done, you can just set GLOBIGNORE to null:
$ GLOBIGNORE=
It is worth mentioning that GLOBIGNORE also accepts glob patterns instead of single file names, such as *.jpg or my*.mp3.
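To avoid having to reset GLOBIGNORE afterwards, the assignment can be confined to a subshell. The sketch below assumes the keep list is called file2, as above, and runs in a scratch directory created only for the demo.

```shell
#!/bin/bash
dir=$(mktemp -d)                   # scratch directory for the demo
cd "$dir" || exit 1
touch 10.jpg 11.jpg 12.jpg 13.jpg
printf '%s\n' 10.jpg 11.jpg 12.jpg > file2

(
  # The parentheses start a subshell, so GLOBIGNORE reverts when it exits.
  GLOBIGNORE=$(paste -sd ':' file2)
  rm -v -- *.jpg                   # only names not listed in GLOBIGNORE match
)
echo "GLOBIGNORE after the subshell: '${GLOBIGNORE-}'"
```

Note that if every match is ignored, the pattern is left unexpanded and rm will complain about a literal *.jpg, unless nullglob is set.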
Alternative:
We can use text-processing tools (grep, awk, etc.) to compare the file names listed in the ignore file with the files in the current directory:
$ awk 'NR==FNR{f[$0];next}(!($0 in f))' file2 <(find . -type f -name '*.jpg' -printf '%f\n')
13.jpg
$ mapfile -t del < <(awk 'NR==FNR{f[$0];next}(!($0 in f))' file2 <(find . -type f -name '*.jpg' -printf '%f\n')) && rm -iv -- "${del[@]}"
rm: remove regular empty file '13.jpg'? y
removed '13.jpg'
Note: This also makes use of Bash process substitution, and will break if file names include newlines.
Upvotes: 2
Reputation: 25419
Another alternative to George Vasiliou's answer would be to read the file with the names of the files to keep using the Bash builtin mapfile and then check, for each candidate file, whether it is in that list.
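#! /bin/bash -eu
# Read the names to keep, one per line (-t strips the trailing newline).
mapfile -t keepthose <keepme.txt
# Initialize explicitly so that an empty list is not "unbound" under -u.
declare -a deletethose=()
for f in "$@"
do
    keep=0
    for not in "${keepthose[@]}"
    do
        [ "${not}" = "${f}" ] && keep=1 || :
    done
    # Queue the file for deletion unless it appeared in the keep list.
    [ ${keep} -gt 0 ] || deletethose+=("${f}")
done
# Remove the 'echo' if you really want to delete files.
echo rm -f "${deletethose[@]}"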
The -t option causes mapfile to trim the trailing newline character from the lines it reads from the file. No other white-space will be trimmed, though. This might be what you want if your file names actually contain white-space, but it could also cause subtle surprises if somebody accidentally puts a space before or after the name of an important file they want to keep.
Note that I'm first building a list of the files that should be deleted and then deleting them all at once rather than deleting each file individually. This saves some sub-process invocations.
The lookup in the list, as coded above, has linear complexity, which gives the overall script quadratic complexity (precisely, N × M, where N is the number of command-line arguments and M the number of entries in the keepme.txt file). If you only have a few dozen files, this should be fine. For many files, an associative array (Bash 4.0 or later) keyed by file name would avoid the inner loop; its keys can be arbitrary non-empty strings and need not be valid identifiers. Alternatively, a more powerful language like Python might be worth consideration.
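Bash 4.0 and later accept arbitrary non-empty strings as associative array keys, so one possible way to avoid the inner loop is sketched below. The sample names are invented for the demo, and the printf calls stand in for reading keepme.txt and for the command-line arguments.

```shell
#!/bin/bash
# Sketch: set membership via an associative array (requires Bash 4.0 or later).
declare -A keep=()
while IFS= read -r name; do
  [ -n "$name" ] && keep["$name"]=1      # empty keys are not allowed, so skip blank lines
done < <(printf '%s\n' 'important.txt' 'my photo.jpg')   # stands in for keepme.txt

deletethose=()
for f in 'important.txt' 'junk.tmp' 'my photo.jpg' 'old log.txt'; do
  # ${keep[$f]+x} expands to x only if the key exists, so no inner loop is needed.
  [ -n "${keep[$f]+x}" ] || deletethose+=("$f")
done
printf 'would delete: %s\n' "${deletethose[@]}"
```

Each lookup is a single hash access, so the overall cost grows with N + M rather than N × M.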
I would also like to mention that the above example simply compares strings. It will not realize that important.txt and ./important.txt are the same file and will therefore delete it. It would be more robust to convert each file name to a canonical path using readlink -f before comparing it.
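A minimal sketch of that canonicalization, assuming the GNU coreutils readlink (the -f flag is not available everywhere). The scratch directory and file names are made up, and the printf call stands in for reading keepme.txt.

```shell
#!/bin/bash
dir=$(mktemp -d)                         # scratch directory for the demo
cd "$dir" || exit 1
touch important.txt junk.tmp

# Canonicalize every entry of the keep list once.
keepthose=()
while IFS= read -r name; do
  keepthose+=("$(readlink -f -- "$name")")
done < <(printf '%s\n' important.txt)    # stands in for keepme.txt

deletethose=()
for f in ./important.txt junk.tmp; do
  canon=$(readlink -f -- "$f")           # absolute path with symlinks resolved
  keep=0
  for k in "${keepthose[@]}"; do
    [ "$k" = "$canon" ] && { keep=1; break; }
  done
  [ "$keep" -gt 0 ] || deletethose+=("$f")
done
printf 'would delete: %s\n' "${deletethose[@]}"
```

Here ./important.txt is recognized as the protected important.txt, so only junk.tmp ends up in the deletion list.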
Furthermore, your users might want to be able to put globbing patterns (like important.*) into the list of files to keep. If you want to handle those, extra logic would be required.
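One way such extra logic might look is to treat each line of the keep list as a glob pattern and match it with [[ ... == ... ]]. This is only a sketch; the patterns and file names are hypothetical, and the array stands in for the lines of keepme.txt.

```shell
#!/bin/bash
patterns=('important.*' 'notes.txt')     # stands in for the lines of keepme.txt
deletethose=()
for f in important.txt important.bak notes.txt junk.tmp; do
  keep=0
  for p in "${patterns[@]}"; do
    # An unquoted right-hand side of == inside [[ ]] is matched as a glob pattern.
    [[ $f == $p ]] && { keep=1; break; }
  done
  [ "$keep" -gt 0 ] || deletethose+=("$f")
done
printf 'would delete: %s\n' "${deletethose[@]}"
```

The comparison is still a string match against the name as given, so a path prefix such as ./ would have to be stripped or canonicalized first.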
Overall, specifying which files not to delete seems a little dangerous, because any mistake errs on the side of deleting too much.
Upvotes: 1