The_Questioner
The_Questioner

Reputation: 312

bash - multiple if clauses. What's wrong with this code?

I'm practicing writing small bash scripts. I have a habit of accidentally typing "teh" instead of "the". So now I want to start at a directory and go through all the files there to find how many occurrences of "teh" there are.

Here's what I have:

    #!/bin/bash

    for file in `find` ; do
     if [ -f "${file}" ] ; then
      for word in `cat "${file}"` ; do
       if [ "${word}" = "teh" -o "${word}" = "Teh" ] ; then
        echo "found one"
       fi    
      done
     fi
    done

When I run it, I get

    found one
    found one
    found one
    found one
    found one
    found one
    ./findTeh: line 6: [: too many arguments

Why am I getting too many arguments. Am I not doing the if statement properly?

Thanks

Upvotes: 2

Views: 75

Answers (2)

Charles Duffy
Charles Duffy

Reputation: 295325

The behavior of test with more than three arguments is, per POSIX, not well-defined and deprecated. That's because variable arguments can be treated as logical constructs with meaning to the test command itself in that case.

Instead, use the following, which will work on all POSIX shells, not only bash:

if [ "$word" = teh ] || [ "$word" = Teh ]; then
  echo "Found one"
fi

...or, similarly, a case statement:

case $word in
  [Tt]eh) echo "Found one" ;;
esac

Now, let's look at the actual standard underlying the test command:

>4 arguments:

The results are unspecified.

[OB XSI] [Option Start] On XSI-conformant systems, combinations of primaries and operators shall be evaluated using the precedence and associativity rules described previously. In addition, the string comparison binary primaries '=' and "!=" shall have a higher precedence than any unary primary. [Option End]

Note the OB flag: The use of a single test command with more than four arguments is obsolete, and is an optional part of the standard regardless (which not all shells are required to implement).


Finally, consider the following revision to your script, with various other bugs fixed (albeit using various bashisms, and thus not portable to all POSIX shells):

#!/bin/bash

# read a filename from find on FD 3
while IFS= read -r -d '' filename <&3; do
  # read words from a single line, as long as there are more lines
  while read -a words; do
    # compare each word in the most-recently-read line with the target
    for word in "${words[@]}"; do
      [[ $word = [Tt]eh ]] && echo "Found one"
    done
  done <"$filename"
done 3< <(find . -type f -print0)

Some of the details:

  • By delimiting filenames with NULs, this works correctly with all possible filenames, including files with spaces or newlines in their names, which for file in $(find) does not.
  • Quoted array expansion, ie. for word in "${words[@]}", avoids glob expansion; with the old code, if * were a word in a file, then the code would subsequently be iterating over filenames in the current directory rather than over words in a file.
  • Using while read -a to read in a single line at a time both avoids the aforementioned glob expansion behavior, and also acts to constrain memory usage (if very large files are in play).

Upvotes: 4

Akira
Akira

Reputation: 4071

A common idiom to avoid this sort of problem is to add "x" on both sides of comparisons:

if [ "x${word}" = "xteh" -o "x${word}" = "xTeh" ] ; then

Upvotes: 1

Related Questions