amindfv

Reputation: 8448

File names with spaces as bulk arguments

If I have a list of files, some of which have spaces in their names, generated from, say:

find . -iname "*hs" | grep foo

How can I pass them as arguments to a single process like process name1 name2 ...?

Note that I want to pass all file names as individual arguments to a single process, so the standard for f in * solutions won't work.

Upvotes: 2

Views: 179

Answers (1)

Jonathan Leffler

Reputation: 754440

find . -iname "*hs*" -exec grep foo {} +

find . -iname "*hs*" -print0 | xargs -0 grep foo

Both work; the first might even be a little more efficient, unless the + form of -exec groups fewer files into a command line than xargs does.
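To see the difference concretely, the sketch below counts how many arguments a command receives under each approach. The temporary directory, the two sample file names, the added -type f test, and the sh -c 'echo "$#"' stand-in for the real process are all assumptions made for illustration; xargs -0 is assumed to be available (GNU and BSD xargs both provide it).

```shell
# Scratch directory with one name containing a space (illustrative data).
tmp=$(mktemp -d)
touch "$tmp/a b.hs" "$tmp/c.hs"

# -exec ... {} + hands each name over as exactly one argument.
exec_count=$(find "$tmp" -type f -iname "*hs" -exec sh -c 'echo "$#"' sh {} +)

# NUL-delimited names survive the pipe intact.
xargs_count=$(find "$tmp" -type f -iname "*hs" -print0 | xargs -0 sh -c 'echo "$#"' sh)

# Plain xargs splits on whitespace, so "a b.hs" arrives as two arguments.
split_count=$(find "$tmp" -type f -iname "*hs" | xargs sh -c 'echo "$#"' sh)

echo "$exec_count $xargs_count $split_count"   # prints: 2 2 3
rm -rf "$tmp"
```

The third count is the failure mode the question is trying to avoid.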

If you're generating the file names via shell globbing, then:

grep foo *hs*

preserves the spacing in file names. Using ls to generate the names is very problematic.


Filtering names

"The grep command was meant to filter the list of filenames before the bulk process, not to search the contents of the files themselves."

So you want names that match both 'hs' and 'foo'? In that case, you are best off using find still:

find . \( -iname "*hs*" -a -iname "*foo*" \) -exec process {} +

Use the boolean capabilities of find.

GNU grep extensions -Z and -z

If you can't do that (your grep regex needs to be too complex for find), then you're in difficulties unless you've got a (hypothetical?) version of grep that will read 'lines' delimited by nulls.

-Z, --null
Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name. For example, grep -lZ outputs a zero byte after each file name instead of the usual newline. This option makes the output unambiguous, even in the presence of file names containing unusual characters like newlines. This option can be used with commands like find -print0, perl -0, sort -z, and xargs -0 to process arbitrary file names, even those that contain newline characters.

This is almost what's wanted, but not quite: -Z changes how grep writes file names, not how it reads its input. The POSIX 2008 getdelim() function would be the tool to use; add a -z option to grep that reads null-delimited 'lines', and then use grep -z ... to filter the find ... -print0 data before it is fed to xargs -0.

The quote above is from the manual page on Mac OS X 10.7.5, where the bundled GNU grep is version 2.5.1. Maybe a more recent version of GNU grep is better equipped to help?

And lo and behold, GNU grep 2.14 supports the requisite option:

-z, --null-data
Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. Like the -Z or --null option, this option can be used with commands like ‘sort -z’ to process arbitrary file names.
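With that option, the filtering pipeline might look like the sketch below. It assumes a GNU grep recent enough to have -z (which, as far as I know, also writes each matching record NUL-terminated, so no -Z or -l is needed), plus xargs -0. The sample file names and the sh -c 'echo "$#"' stand-in for the real process are made up for illustration.

```shell
# Scratch directory: only "foo bar.hs" matches both *hs and foo.
tmp=$(mktemp -d)
touch "$tmp/foo bar.hs" "$tmp/other.hs" "$tmp/foo.txt"

# find writes NUL-terminated names, grep -z filters them as NUL-terminated
# records, and xargs -0 passes each survivor as a single argument.
# The subshell runs find from inside the directory so that the random
# directory name is not part of what grep sees.
matched=$(cd "$tmp" &&
    find . -type f -iname "*hs" -print0 |
    grep -z foo |
    xargs -0 sh -c 'echo "$#"' sh)

echo "$matched"   # prints: 1
rm -rf "$tmp"
```

Any regex that grep accepts can replace foo, which is the point of filtering with grep rather than with find's own tests.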

Run a script from find

The other option, not to be ignored, is to create a script to be run by the find command:

find . -iname "*hs*" -exec ./list-foo {} +

Where the list-foo script might be:

for arg in "$@"
do
    case "$arg" in
    (*foo*) echo "$arg";;   # This will still cause problems
    esac
done

This identifies the files; the echo is sub-optimal. Maybe you need to capture the names in an array, and then invoke the final command using the array:

array=( )
i=0
for arg in "$@"
do
    case "$arg" in
    (*foo*) array[$((i++))]="$arg";;
    esac
done

if [ "$i" -gt 0 ]
then real_work "${array[@]}"
fi

Where real_work is the program (script?) that does the real work. You can throw list-foo away once you're done with it, unless you're going to be doing the same filtering job over and over.
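The array version needs a shell with arrays, such as bash. If only a plain POSIX sh is available, a similar effect can be had by rebuilding the positional parameters instead. This is a different technique from the script above, offered only as a sketch; filter_and_run is a hypothetical name, and real_work is replaced here by a stand-in that just reports its argument count.

```shell
# Stand-in for the real program: prints how many arguments it received.
real_work() { printf '%s\n' "$#"; }

# Keep only the arguments containing "foo", then run real_work on them.
filter_and_run() {
    # "$@" is expanded once, when the loop starts, so the loop still
    # visits every original argument while the list is being rewritten.
    for arg in "$@"
    do
        case "$arg" in
        (*foo*) set -- "$@" "$arg";;   # append a match to the end
        esac
        shift                           # drop the original from the front
    done

    if [ "$#" -gt 0 ]
    then real_work "$@"
    fi
}

result=$(filter_and_run "a foo.hs" "bar.hs" "foo2.hs")
echo "$result"   # prints: 2
```

In a script run by find, the body of filter_and_run would be the whole script, with real_work being the actual command.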


Globbing names that contain a space

You can glob names that contain a space by escaping the space:

rm -i -- *\ *

This expands to just the file names that contain a space, which helps in cleaning up a directory full of such files created for testing the answers to this question.
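Before deleting anything, it can be worth checking what the glob matches. A minimal sketch, using made-up file names in a scratch directory:

```shell
tmp=$(mktemp -d)
touch "$tmp/a b.hs" "$tmp/c.hs" "$tmp/d e f.txt"

# Load the glob's expansion into the positional parameters and count it.
# If nothing matched, the pattern itself would be left as the one argument.
space_count=$(cd "$tmp" && set -- *\ * && echo "$#")

echo "$space_count"   # prints: 2
rm -rf "$tmp"
```

Replacing echo "$#" with printf '%s\n' "$@" lists the matching names instead of counting them.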

Upvotes: 5
