user1464409

Reputation: 1042

How do I combine my find and grep command to delete the files?

I currently have the following command which produces a list of files:

find . -type f | grep -i -f ./remove_list

This command reads a file called "remove_list", which contains a list of terms (actually regular expressions) that I want to match against the output of the find command.

The above command works fine, but I don't understand how to delete each of the files it finds (especially as some of the file names contain spaces).

I thought I could do something like this:

find . -type f -print0 | grep -i -f ./remove_list | xargs -0 rm

As I understand it, -print0 and -0 are necessary to handle file names with spaces in them, but when I run this command I get an error message stating "Binary file (standard input) matches".

How do I read in a number of (regex) terms from a file so that they can be used as arguments in the find statement?

Upvotes: 4

Views: 1977

Answers (4)

Ole Tange

Reputation: 33685

If you have GNU Parallel installed:

find . -type f | grep -i -f ./remove_list | parallel rm

If it is not packaged for your system, this should install it in 10 seconds:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

To learn more, watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Then walk through the tutorial (man parallel_tutorial). Your command line will love you for it.

Upvotes: 0

kojiro

Reputation: 77059

You just need grep to output a file list as well. Grep doesn't know that these are files: it just views them as a stream of data. If the file names contain no newlines, spaces, quotes, or backslashes (all of which plain xargs treats specially), then you can do:

find . -type f | grep -if ./remove_list | xargs rm
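If the names can contain spaces but not newlines, a middle ground is to have xargs split on newlines only. This is a minimal sketch, assuming GNU xargs (-d and -r are GNU extensions, not POSIX); the function name remove_listed is made up for illustration.

```shell
# Sketch: assumes GNU xargs and that no file name contains a newline.
# remove_listed is a hypothetical helper name, not part of any tool.
remove_listed() {
    # $1: pattern file (one regex per line, as in ./remove_list)
    # -d '\n' makes every input line a single argument, so spaces survive;
    # -r skips running rm when nothing matched; -- ends rm's option parsing.
    find . -type f | grep -i -f "$1" | xargs -r -d '\n' rm --
}
```

It would be called as `remove_listed ./remove_list`. Note that find also lists the pattern file itself, so it is safer to keep that file outside the tree being searched if one of its patterns could match its own name.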

The -print0, --null and -0 arguments to various commands exist to prevent errors caused by edge cases in file names, such as names containing whitespace or newlines. The problem gets harder if you actually have to deal with those, because your grep command is trying to filter the names themselves. If you really need to do that, you may want to switch to a tool that can deal with each name individually. In shells with recursive globbing (such as bash 4):

shopt -s globstar
for f in **/*; do
    # check if "$f" is a file and grep matches its name
    if [[ -f $f ]] && grep -qif ./remove_list <<< "$f"; then
        rm "$f"
    fi
done
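One difference from the find pipeline is worth noting: by default `**/*` does not match hidden files or anything under hidden directories, whereas `find . -type f` does. A sketch of the same loop that covers them, assuming bash 4 or later; remove_matching is a hypothetical function name.

```shell
# Sketch for bash >= 4; remove_matching is a hypothetical name.
# The body runs in a subshell ( ... ) so the shopt changes do not leak out.
remove_matching() (
    # $1: pattern file (one regex per line)
    # globstar: enable **; dotglob: include hidden names;
    # nullglob: expand to nothing when the directory is empty.
    shopt -s globstar dotglob nullglob
    for f in **/*; do
        # Only regular files whose path matches one of the patterns.
        if [ -f "$f" ] && printf '%s\n' "$f" | grep -qif "$1"; then
            rm -- "$f"
        fi
    done
)
```

Names produced by the glob have no leading `./`, unlike the output of find, so patterns anchored on that prefix would need adjusting.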

As always, you can mimic this in find and a standard shell with the same logic, but somewhat uglier:

find . -type f -exec bash -c 'for f; do
    if printf "%s\n" "$f" | grep -qif ./remove_list; then
        rm "$f"
    fi
done' _ {} +

Upvotes: 2

Connor

Reputation: 64644

If your first command finds all of the files, you can pass the output through a while loop to delete each file.

find . -type f | grep -i -f ./remove_list | while IFS= read -r line; do rm "$line"; done
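A line-based loop like this still breaks on names that contain newlines. If that matters and per-file control is still wanted, a NUL-delimited variant can be sketched as follows, assuming bash (for `read -d ''`) and GNU grep (for -z); remove_each is a hypothetical name.

```shell
# Sketch: assumes bash and GNU grep; remove_each is a hypothetical name.
remove_each() {
    # $1: pattern file (one regex per line)
    # find emits NUL-terminated names, grep -z filters them as NUL-terminated
    # records, and read -d '' consumes one NUL-terminated name at a time.
    find . -type f -print0 |
        grep -z -i -f "$1" |
        while IFS= read -r -d '' file; do
            rm -- "$file"
        done
}
```

The loop body is the place to add anything per-file, such as logging the name or using `rm -i` instead.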

Upvotes: 2

konsolebox

Reputation: 75458

If you're using GNU grep, you can use -Z:

   -Z, --null
          Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name.  For example,
          grep -lZ outputs a zero byte after each file name instead of the usual newline.  This option makes the output
          unambiguous, even in the presence of file names containing unusual characters like newlines.  This option can be used
          with commands like find -print0, perl -0, sort -z, and xargs -0 to process arbitrary file names, even those that contain
          newline characters.

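You also need -z so that grep reads its input as NUL-terminated records. (Current GNU grep documents -z as applying to the output as well, so the matching names are also written NUL-terminated.)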

   -z, --null-data
          Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline.  Like
          the -Z or --null option, this option can be used with commands like sort -z to process arbitrary file names.

So your command may look like:

find . -type f -print0 | grep -z -Z -i -f ./remove_list | xargs -0 rm
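Since rm is irreversible, it may be worth previewing the matches before deleting anything. A minimal sketch, assuming GNU grep and GNU xargs (for -z and -r); all three function names are hypothetical.

```shell
# Sketch: assumes GNU grep and GNU xargs; function names are hypothetical.
list_matches() {
    # $1: pattern file. Emits NUL-terminated paths that match a pattern;
    # -z makes grep read and write NUL-terminated records.
    find . -type f -print0 | grep -z -i -f "$1"
}

preview_matches() {
    # One match per line, for human inspection only (not safe to parse).
    list_matches "$1" | tr '\0' '\n'
}

delete_matches() {
    # -r: do not run rm at all when nothing matched; --: end of rm options.
    list_matches "$1" | xargs -0 -r rm --
}
```

A cautious run would be `preview_matches ./remove_list` first, then `delete_matches ./remove_list` once the list looks right.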

Upvotes: 1
