ylluminate

Reputation: 12369

Why would xargs split input on spaces and how to resolve it?

In the following bash script, I'm capturing a file list from a path into a variable and then passing it to xargs for further operations.

Simply echoing the variable prints each path, spaces included, on its own line as expected. However, when I pipe it to xargs with printf or echo, xargs appears to split each line on spaces as well. The following example illustrates this, with comments showing the results I'm seeing:

# Using GNU find:
list="$( find "$SOME_PATH" -type f )"

excluded_list="$( egrep -v -f FILE_WITH_PATTERNS_OF_FOLDERS_TO_EXCLUDE <<< $list )"

# This prints out just fine with lines such as "/some/path/here with spaces" on their own line, eg:
#   /some/path/here with spaces
#   /another/path/here with spaces
#   /and yet another/path/here with spaces
echo "$excluded_list"

# But this prints out a line such as the above example "/some/path/here with spaces" broken up like this instead:
#   /some/path/here 
#   with 
#   spaces
#   /another/path/here 
#   with 
#   spaces
#   /and 
#   yet
#   another/path/here 
#   with 
#   spaces
printf "%s" "$excluded_list" | xargs -n 1 -P 1 sh -c 'echo "$0"'
# And the same result as `printf` above:
echo "$excluded_list" | xargs -n 1 -P 1 sh -c 'echo "$0"'

Upvotes: 0

Views: 571

Answers (1)

tshiono

Reputation: 22012

Storing multiple filenames in a single variable is an antipattern: a pathname may contain any character except a null byte (newlines included), so no delimiter lets you reliably split the variable back into the original filenames.

In your example, echo "$excluded_list" may appear to preserve the original filenames, but unfortunately it does not. Try putting two or more consecutive spaces in a pathname and see what happens.
As a stopgap, you can wrap $list in double quotes, as in <<< "$list", but that is no more than a provisional remedy.
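As for the splitting itself: by default, xargs treats blanks as well as newlines as argument delimiters, so a line containing spaces reaches the command as several arguments. The following is a minimal sketch of that behaviour, using made-up sample paths in place of the real list. It counts the arguments xargs produces by default and again after the newlines are converted to null bytes for xargs -0.

```shell
# Hypothetical sample data: two paths, each containing spaces.
excluded_list='/some/path/here with spaces
/and yet another/path'

# Default xargs: the input is split on blanks as well as on newlines.
default_count=$(printf '%s\n' "$excluded_list" |
    xargs -n 1 sh -c 'echo "$0"' | wc -l)

# Newlines converted to NUL bytes, then xargs -0: one argument per line.
nul_count=$(printf '%s\n' "$excluded_list" | tr '\n' '\0' |
    xargs -0 -n 1 sh -c 'echo "$0"' | wc -l)

echo "default: $default_count arguments, with -0: $nul_count arguments"
```

GNU xargs also accepts -d '\n' to the same effect. Either way this only helps with a list that is already newline-separated; it still misreads any filename that itself contains a newline.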

The workarounds are:

  1. To store the filenames in an array
  2. To use a null byte as a delimiter and process the result via pipe

For instance you can say something like:

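# Start from an empty array so that repeated runs do not accumulate entries.
excluded_list=()
# Read NUL-delimited paths; IFS= and -r keep whitespace and backslashes intact.
while IFS= read -r -d "" f; do
    excluded_list+=("$f")
done < <(find "$SOME_PATH" -type f -print0 | egrep -v -z -Z -f FILE_WITH_PATTERNS_OF_FOLDERS_TO_EXCLUDE -)
# Each array element is one complete pathname.
for f in "${excluded_list[@]}"; do
    echo "$f"
done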

or

find "$SOME_PATH" -type f -print0 | egrep -v -z -Z -f FILE_WITH_PATTERNS_OF_FOLDERS_TO_EXCLUDE - | xargs -0 -n 1 -P 1 sh -c 'echo "$0"'

Note that the -z and -Z options are GNU grep extensions and may not work on other platforms.
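To check that a null-delimited pipeline really does preserve consecutive spaces, here is a small self-contained sketch. The directory tree, the file names, and the pattern file are all made up for the demonstration, and grep -z assumes GNU grep as noted above.

```shell
# Hypothetical demo tree; every name here is made up for illustration.
demo_dir=$(mktemp -d)
patterns=$(mktemp)
mkdir -p "$demo_dir/keep  me" "$demo_dir/skip this"
touch "$demo_dir/keep  me/a  file.txt" "$demo_dir/skip this/b file.txt"

# One extended regex per line, playing the role of the exclusion file.
printf '%s\n' '/skip this/' > "$patterns"

# NUL-delimited all the way from find through grep to xargs.
# Only the basename is printed so the output does not depend on the temp path.
result=$(find "$demo_dir" -type f -print0 |
    grep -E -z -v -f "$patterns" |
    xargs -0 -n 1 sh -c 'printf "[%s]\n" "${0##*/}"')

echo "$result"
rm -rf "$demo_dir" "$patterns"
```

The brackets in the output make it easy to see that the file name arrives as a single argument with both spaces intact, while the file under the excluded folder is filtered out.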

Upvotes: 2
