Reputation: 127
I want to find some files and calculate the shasum by using a pipe command.
find . -type f | xargs shasum
But there are files with quotes in their names in my directory, for example the file named
file with "special" characters.txt
The pipe output looks like this:
user@home ~ $ find . -type f | xargs shasum
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty1.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty2.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty3.txt
shasum: ./file:
shasum: with: No such file or directory
shasum: special: No such file or directory
shasum: characters.txt: No such file or directory
25ea78ccd362e1903c4a10201092edeb83912d78 ./file1.txt
25ea78ccd362e1903c4a10201092edeb83912d78 ./file2.txt
The quotes within the filename cause problems.
How can I tell shasum to process the files correctly?
Upvotes: 1
Views: 333
Reputation: 295403
The short explanation is that xargs is widely considered broken-by-design unless you use extensions to the standard that disable its behavior of parsing and honoring quoting and escaping in its input. See the xargs section of UsingFind for more details.
On a system with GNU or modern BSD extensions (including macOS), you can (and should) NUL-delimit the output from find:
find . -type f -print0 | xargs -0 shasum --
find -exec
That said, you can do even better by getting xargs out of the loop entirely, in a way that's fully compliant with modern (~2006) POSIX:
find . -type f -exec shasum -- '{}' +
Note that the -- argument tells shasum that all following arguments are filenames. If you'd used find * -type f ..., then you could have a result starting with a dash; using -- ensures that this result isn't interpreted as a set of options.
If you have GNU xargs, but don't have the option of a NUL-delimited input stream, then xargs -d $'\n' (in shells such as bash with ksh extensions) will avoid the quoting and escaping behavior:
xargs -d $'\n' shasum -- <files.txt
However, this is suboptimal, because newline literals are actually possible inside filenames, thus making it impossible to distinguish between a newline that separates two names and a newline that is part of an actual name. Consider the following scenario:
mkdir -p ./file.txt$'\n'/etc/passwd$'\n'/
touch ./file.txt$'\n'/etc/passwd$'\n'file.txt file.txt
find . -type f | xargs -d $'\n' shasum --
This will have output akin to the following:
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./file.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./file.txt
c0c71bac843a3ec7233e99e123888beb6da8fbcf /etc/passwd
da39a3ee5e6b4b0d3255bfef95601890afd80709 file.txt
...thus allowing an attacker who can control filenames to cause a shasum for an arbitrary file outside the intended directory structure to be added to your output.
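The ambiguity can be made visible by counting records. This sketch recreates the hostile layout in a scratch directory and compares how many names a newline-delimited reader would see with how many a NUL-delimited reader would see. The scratch directory and variable names are illustrative, and a literal newline stored in a variable stands in for $'\n' so that the sketch does not depend on ksh extensions.

```shell
# A literal newline, portable to shells without $'\n'.
nl='
'
tmp=$(mktemp -d)
mkdir -p "$tmp/file.txt${nl}/etc"
: > "$tmp/file.txt${nl}/etc/passwd${nl}file.txt"   # one file, two embedded newlines
: > "$tmp/file.txt"                                # an ordinary file

# Records seen by a newline-delimited consumer.
newline_records=$(find "$tmp" -type f | grep -c '')

# How many of those records read exactly /etc/passwd.
forged=$(find "$tmp" -type f | grep -cx '/etc/passwd')

# Records seen by a NUL-delimited consumer: one NUL byte per file.
nul_records=$(find "$tmp" -type f -print0 | tr -cd '\0' | wc -c | tr -d ' ')

echo "newline-delimited: $newline_records records, $forged of them '/etc/passwd'"
echo "NUL-delimited:     $nul_records records"

rm -rf "$tmp"
```

Only two files exist, yet the newline-delimited view contains four records, one of which names a path outside the tree; the NUL-delimited view reports two.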
Upvotes: 5