zomega

Reputation: 2316

Detect if piped command contains string in bash

I have a bash script with one line looking like this:

Command1 | Command2 | Command3

Command1 produces output and the other commands (Command2 and Command3) filter the output.

The filtering happens in real time line by line (using sed). I cannot wait for Command1 to finish before filtering.

I want to know whether the output of Command1 contains a string (e.g. "foo bar\n\n"), and I want to know this once Command1 has finished. As you can see, the string I'm looking for spans multiple lines.

Is this possible?

Upvotes: 0

Views: 213

Answers (1)

Charles Duffy

Reputation: 295619

Scanning from a Process Substitution

If you create a shell function testcmd that checks for your string and takes an action when it's seen, the pipeline can run it alongside your filters. Note that the action needs to be something like running a program or creating a file: any variable it sets won't be visible to the parent shell that launched the pipeline. This would look like:

findit() {
  awk '
    BEGIN { rc=1 }  # starting: unless we find the pattern, exit w/ an error
    armed && $0 == "" { rc=0; exit }   # success: empty line right after "foo bar"
    { armed = ($0 ~ /foo bar$/) }      # armed only if this line ends in "foo bar"
    END { exit(rc) }                   # honor rc as exit status
  '
}
testcmd() { findit && { echo "Found the pattern" >&2; touch found; }; }
Command1 | tee >(testcmd) | Command2 | Command3

This code will create the found file the moment foo bar\n\n (assuming the \ns are meant to represent literal newlines) is seen in the output of Command1, without waiting for Command2 or Command3 (or even waiting for Command1 to finish, if it has more output after this string is emitted).
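Since the question asks for the answer once Command1 has finished, one option is to test for the marker file after the pipeline returns. The sketch below is self-contained, so Command1 is a stand-in, sed stands in for the real filters, and the marker lives in a temporary directory; all of those names are assumptions rather than parts of your script. It relies on bash giving the process substitution the same stdout as tee, so a downstream filter that reads to EOF cannot finish until testcmd has exited.

```bash
#!/usr/bin/env bash
# Stand-in for the real Command1 (an assumption for this sketch).
Command1() { printf '%s\n' 'first line' 'prefix foo bar' '' 'last line'; }

# Self-contained copy of findit: exit 0 only if a line ending in "foo bar"
# is immediately followed by an empty line.
findit() {
  awk '
    BEGIN { rc=1 }
    armed && $0 == "" { rc=0; exit }
    { armed = ($0 ~ /foo bar$/) }
    END { exit(rc) }
  '
}

workdir=$(mktemp -d)     # hypothetical scratch location for the marker
marker=$workdir/found
testcmd() { findit && touch "$marker"; }

# sed stands in for Command2/Command3. It reads to EOF, and testcmd holds the
# same pipe open as its stdout, so sed cannot finish before testcmd exits.
filtered=$(Command1 | tee >(testcmd) | sed 's/^/> /')

# The pipeline has returned, so the marker (if any) is already in place.
if [[ -e $marker ]]; then result=found; else result=missing; fi
printf '%s\n' "$result"
rm -rf "$workdir"
```

If your last filter can exit before reading all of its input, this ordering guarantee no longer holds, and you would need some other way to wait for testcmd.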

The >(...) syntax is a process substitution: bash replaces it with a filename, and anything written to that file is fed to the stdin of the command inside the parentheses.
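To make that concrete, here is a small sketch showing what >(...) expands to and why assignments made inside it never reach the parent shell. The helper show_arg and the variable names are made up for illustration, and the exact path printed (often /dev/fd/63 on Linux) depends on the system.

```bash
#!/usr/bin/env bash
# show_arg is a hypothetical helper that just prints its first argument.
show_arg() { printf '%s\n' "$1"; }

# >(...) is replaced by a path before the command runs.
path=$(show_arg >(cat >/dev/null))
printf 'expanded to: %s\n' "$path"

# The command inside >(...) runs in a separate process, so this assignment
# is lost when that process exits.
seen=no
printf 'foo bar\n' | tee >(read -r line; seen=yes) >/dev/null
printf 'seen=%s\n' "$seen"   # still "no" in the parent shell
```

This is why the approach above signals a match through a side effect such as creating a file instead of setting a variable.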

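This doesn't interrupt Command2 and Command3 as long as tee keeps passing content to its remaining outputs after one of them stops reading early. GNU tee does this when run with -p (short for --output-error=warn-nopipe); without that option it can be killed by SIGPIPE on its next write to the closed pipe, so use tee -p if Command1 may emit more output after the match.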
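A quick way to see tee carrying on after one reader quits is sketched below. It assumes GNU coreutils tee, since -p is not available in every tee implementation, and it uses seq and head purely as stand-ins: head stops reading after one line, while wc counts what still reaches the main pipeline.

```bash
#!/usr/bin/env bash
# Assumes GNU coreutils tee (-p keeps tee running when a pipe output closes).
# head exits after the first line; its own output is discarded so it does not
# mix into the data that wc counts.
count=$(seq 1 100000 | tee -p >(head -n 1 >/dev/null) | wc -l)
printf 'lines delivered downstream: %d\n' "$count"
```

With -p in place, all 100000 lines should arrive at wc even though head closed its end almost immediately.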


Testing The Above

To test the above logic, we can define our functions as follows:

Command1() {
  printf '%s\n' 'first line' 'prefix foo bar' '' 'last line'
}

Command2() {
  echo "command2 reading" >&2
  in=$(cat)

  sleep 1

  echo "command2 writing" >&2
  printf '%s\n' "$in"
}

Command3() { echo "command3 read $(wc -l) lines"; }

...with which our combined output from Command1 | tee >(testcmd) | Command2 | Command3 is:

command2 reading
Found the pattern
command2 writing
command3 read 4 lines

(You might see Found the pattern before command2 reading; the ordering between those two is undefined. The point is that Found the pattern appears before command2 writing, and also before command3 completes.)
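It can also help to exercise the matcher on its own, away from the pipeline. The sketch below carries its own copy of findit, tracking state in a variable named armed, and a hypothetical check helper that feeds a few lines to it and reports the exit status; the sample inputs are made up to cover the edge cases.

```bash
#!/usr/bin/env bash
# Exit 0 only if a line ending in "foo bar" is immediately followed by an
# empty line; exit 1 otherwise.
findit() {
  awk '
    BEGIN { rc=1 }
    armed && $0 == "" { rc=0; exit }
    { armed = ($0 ~ /foo bar$/) }
    END { exit(rc) }
  '
}

# check LABEL LINE...: hypothetical helper that pipes the lines into findit
# and prints the label with findit's exit status.
check() {
  local label=$1; shift
  printf '%s\n' "$@" | findit
  printf '%s -> %d\n' "$label" "$?"
}

r1=$(check 'match mid-stream'     'first line' 'prefix foo bar' '' 'last line')
r2=$(check 'line in between'      'foo bar' 'other' '')
r3=$(check 'repeated first line'  'foo bar' 'foo bar' '')
r4=$(check 'no blank line at EOF' 'foo bar')
printf '%s\n' "$r1" "$r2" "$r3" "$r4"
```

The first and third cases report 0 (pattern found), while the second and fourth report 1, because the empty line either does not directly follow foo bar or never arrives.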

Upvotes: 1
