Reputation: 312
I'm practicing writing small bash scripts. I have a habit of accidentally typing "teh" instead of "the". So now I want to start at a directory and go through all the files there to find how many occurrences of "teh" there are.
Here's what I have:
#!/bin/bash
for file in `find` ; do
if [ -f "${file}" ] ; then
for word in `cat "${file}"` ; do
if [ "${word}" = "teh" -o "${word}" = "Teh" ] ; then
echo "found one"
fi
done
fi
done
When I run it, I get
found one
found one
found one
found one
found one
found one
./findTeh: line 6: [: too many arguments
Why am I getting too many arguments. Am I not doing the if statement properly?
Thanks
Upvotes: 2
Views: 75
Reputation: 295325
The behavior of test
with more than three arguments is, per POSIX, not well-defined and deprecated. That's because variable arguments can be treated as logical constructs with meaning to the test command itself in that case.
Instead, use the following, which will work on all POSIX shells, not only bash:
if [ "$word" = teh ] || [ "$word" = Teh ]; then
echo "Found one"
fi
...or, similarly, a case statement:
case $word in
[Tt]eh) echo "Found one" ;;
esac
Now, let's look at the actual standard underlying the test
command:
>4 arguments:
The results are unspecified.
[OB XSI] [Option Start] On XSI-conformant systems, combinations of primaries and operators shall be evaluated using the precedence and associativity rules described previously. In addition, the string comparison binary primaries '=' and "!=" shall have a higher precedence than any unary primary. [Option End]
Note the OB
flag: The use of a single test
command with more than four arguments is obsolete, and is an optional part of the standard regardless (which not all shells are required to implement).
Finally, consider the following revision to your script, with various other bugs fixed (albeit using various bashisms, and thus not portable to all POSIX shells):
#!/bin/bash
# read a filename from find on FD 3
while IFS= read -r -d '' filename <&3; do
# read words from a single line, as long as there are more lines
while read -a words; do
# compare each word in the most-recently-read line with the target
for word in "${words[@]}"; do
[[ $word = [Tt]eh ]] && echo "Found one"
done
done <"$filename"
done 3< <(find . -type f -print0)
Some of the details:
for file in $(find)
does not.for word in "${words[@]}"
, avoids glob expansion; with the old code, if *
were a word in a file, then the code would subsequently be iterating over filenames in the current directory rather than over words in a file.while read -a
to read in a single line at a time both avoids the aforementioned glob expansion behavior, and also acts to constrain memory usage (if very large files are in play).Upvotes: 4
Reputation: 4071
A common idiom to avoid this sort of problem is to add "x" on both sides of comparisons:
if [ "x${word}" = "xteh" -o "x${word}" = "xTeh" ] ; then
Upvotes: 1