Reputation: 1042
I currently have the following command which produces a list of files:
find . -type f | grep -i -f ./remove_list
This command reads a file called "remove_list", which contains a list of terms (actually regular expressions) that I want to match against the output of the find command.
The above command works fine, but I don't understand how to delete each of the files found (especially as some of them contain spaces).
I thought I could do something like this:
find . -type f -print0 | grep -i -f ./remove_list | xargs -0 rm
As I understand it, -print0 and -0 are necessary to handle filenames with spaces in them, but now when I run the command I get an error message stating "Binary file (standard input) matches".
How do I read in a number of (regex) terms from a file so that they can be used as arguments in the find statement?
Upvotes: 4
Views: 1977
Reputation: 33685
If you have GNU Parallel installed:
find . -type f | grep -i -f ./remove_list | parallel rm
If it is not packaged for your system, this should install it in 10 seconds:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
To learn more: Watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.
Upvotes: 0
Reputation: 77059
You just need grep to output a file list as well. Grep doesn't know that these are files: it just views them as a stream of data. If the files themselves don't have newlines in their names and are generally whitespace-safe, then you can do:
find . -type f | grep -if ./remove_list | xargs rm
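To see why the whitespace caveat matters, here is a small sketch with a made-up file name. It shows how plain xargs splits a name containing a space into two arguments, and how the -d '\n' option keeps each line as one argument. That option is a GNU xargs extension, so whether it is available on your system is an assumption. Nothing is deleted; printf only shows the arguments xargs would pass on.

```shell
# Hypothetical file name containing a space.
# Default xargs splits its input on blanks as well as newlines.
split=$(printf '%s\n' './my file.txt' | xargs printf '[%s]\n')

# GNU xargs only: treat each input line as exactly one argument.
whole=$(printf '%s\n' './my file.txt' | xargs -d '\n' printf '[%s]\n')

printf 'default xargs:\n%s\n' "$split"
printf 'xargs -d newline (GNU only):\n%s\n' "$whole"
```

With the default splitting, rm would be asked to delete "./my" and "file.txt" rather than the single file you meant.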
The -print0, --null and -0 arguments to various commands are for preventing errors due to edge cases in file names, such as names containing runs of whitespace or newlines. The problem gets harder if you actually have to deal with those, because your grep command is trying to filter the names themselves. If you really need to do that, you may want to switch to a tool that can deal with each name individually. In shells with recursive globbing (such as bash 4):
shopt -s globstar
for f in **/*; do
    # check if "$f" is a file and grep matches its name
    if [[ -f $f ]] && grep -qif ./remove_list <<< "$f"; then
        # "--" stops rm from treating a name that starts with "-" as an option
        rm -- "$f"
    fi
done
As always, you can mimic this in find and a standard shell with the same logic, but somewhat uglier:
find . -type f -exec bash -c 'for f; do
    if printf "%s\n" "$f" | grep -qif ./remove_list; then
        rm "$f"
    fi
done' _ {} +
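Before running either loop for real, it may be worth doing a dry run. The following is a minimal sketch, assuming a throwaway directory, made-up file names and a made-up remove_list. It uses the same find -exec logic but prints the names that would be removed instead of deleting them.

```shell
# Sandbox with hypothetical file names; nothing outside it is touched.
sandbox=$(mktemp -d)
touch "$sandbox/keep me.txt" "$sandbox/old backup.bak" "$sandbox/Notes.TMP"
printf '%s\n' '\.bak$' '\.tmp$' > "$sandbox/remove_list"

# Same logic as above, with echo in place of rm.
# The format string uses double quotes because the whole script is
# already wrapped in single quotes for bash -c.
would_remove=$(cd "$sandbox" && find . -type f -exec bash -c 'for f; do
    if printf "%s\n" "$f" | grep -qif ./remove_list; then
        echo "$f"
    fi
done' _ {} + | LC_ALL=C sort)

printf '%s\n' "$would_remove"
rm -rf "$sandbox"
```

Once the printed list looks right, swapping echo back to rm gives the destructive version.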
Upvotes: 2
Reputation: 64644
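If your first command finds all of the files, you can pass the output through a while loop to delete each file. This assumes no file name contains a newline; IFS= and -r keep read from trimming whitespace or interpreting backslashes in the names.
find . -type f | grep -i -f ./remove_list | while IFS= read -r line; do rm -- "$line"; done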
Upvotes: 2
Reputation: 75458
If you're using GNU grep, you can use -Z:
-Z, --null Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name. For example, grep -lZ outputs a zero byte after each file name instead of the usual newline. This option makes the output unambiguous, even in the presence of file names containing unusual characters like newlines. This option can be used with commands like find -print0, perl -0, sort -z, and xargs -0 to process arbitrary file names, even those that contain newline characters.
And you also need -z for the input.
-z, --null-data Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. Like the -Z or --null option, this option can be used with commands like sort -z to process arbitrary file names.
So your command may look like:
find . -type f -print0 | grep -z -Z -i -f ./remove_list | xargs -0 rm
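To try this safely first, here is a minimal sketch of the same pipeline in a throwaway directory. It assumes GNU grep, and the remove_list contents and file names (one with a space, one with an embedded newline) are made up for illustration.

```shell
# Sandbox with hypothetical file names; nothing outside it is touched.
sandbox=$(mktemp -d)
touch "$sandbox/keep me.txt" "$sandbox/old backup.bak" "$sandbox/odd
name.bak"
printf '%s\n' '\.bak$' > "$sandbox/remove_list"

# NUL-delimited all the way through: find -print0, grep -z, xargs -0.
(cd "$sandbox" && find . -type f -print0 | grep -z -Z -i -f ./remove_list | xargs -0 rm)

# List what survived the deletion.
remaining=$(cd "$sandbox" && find . -type f | LC_ALL=C sort)
printf '%s\n' "$remaining"
rm -rf "$sandbox"
```

Note that if grep matches nothing, xargs may still run rm once with no arguments and print an error; GNU xargs has -r (--no-run-if-empty) to avoid that.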
Upvotes: 1