IMTheNachoMan

Reputation: 5811

Capture output of piped command while still knowing if first command wrote to stderr

Is it possible to capture the output of cmd2 from cmd1 | cmd2 while still knowing if cmd1 wrote to stderr?

I am using exiftool to strip exif data from files:

exiftool "/path/to/file.ext" -all= -o -

This writes the output to stdout. This works for most files. If the file is corrupt or not a video/image file it will not write anything to stdout and, instead, write an error to stderr. For example:

Error: Writing of this type of file is not supported - /path/to/file.ext

I ultimately need to capture the md5 of files that don't result in an error. Right now I am doing this:

md5=$(exiftool "/path/to/file.ext" -all= -o - | md5sum | awk '{print $1}')

Regardless of whether the file is an image/video, this calculates an md5.

If the file is an image/video, it captures the file's md5 as expected.

If the file is not an image/video, exiftool writes nothing to stdout, so md5sum calculates the md5 of empty input. In that case, though, the command also writes an error to stderr.

I need to be able to check whether anything was written to stderr so I know to discard the calculated md5.

I know one alternative is to run exiftool twice: once without md5sum and without capturing, to see whether anything was written to stderr, and then a second time with md5sum and capturing. I want to avoid that because exiftool can take a long time on big files; I'd rather run it only once.

Update

Also, I can't capture the output of just exiftool because it yields this error:

bash: warning: command substitution: ignored null byte in input

And I cannot ignore this error because the md5 result is not the same. That is to say:

file=$(exiftool "/path/to/file.ext" -all= -o -)
echo "$file" | md5sum

Will print the above null byte error and will not have the same md5 result as:

exiftool "/path/to/file.ext" -all= -o - | md5sum
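
The mismatch can be reproduced without exiftool. This is a minimal sketch, assuming a three-byte payload from printf as a stand-in for the binary output; the variable names are only illustrative:

```shell
#!/usr/bin/env bash
# Hash a 3-byte payload containing a null byte directly from the pipe.
direct=$(printf 'a\0b' | md5sum | awk '{print $1}')

# Capture the same payload in a variable first. Bash drops the null byte
# (newer versions also warn on stderr, silenced here by the group redirection).
{ captured=$(printf 'a\0b'); } 2>/dev/null
via_var=$(printf '%s' "$captured" | md5sum | awk '{print $1}')

printf 'direct:  %s\nvia var: %s\n' "$direct" "$via_var"
if [ "$direct" != "$via_var" ]; then
  echo "hashes differ: the variable holds ${#captured} characters, the pipe carried 3 bytes"
fi
```

The variable ends up holding only "ab", so hashing it can never match the hash of the original stream.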

Upvotes: 2

Views: 461

Answers (4)

Ivan

Reputation: 7277

Bash has a special array variable for this, PIPESTATUS, which holds the exit status of each command in the most recent pipeline. A simple example, where file and file2 both exist:

$ ls file &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
0 0

And here file3 does not exist:

$ ls file3 &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
2 0

$ ls file3; echo $?
ls: cannot access 'file3': No such file or directory
2

Triple pipe

$ ls file 2> /dev/null | ls file3 &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
0 2 0

Capturing a pipeline in a variable, tested with grep (index 0 is the first command, ls):

$ test=$(ls file | grep .; ((${PIPESTATUS[0]} > 0)) && echo error)
$ echo $test
file

$ test=$(ls file3 | grep .; ((${PIPESTATUS[0]} > 0)) && echo error)
ls: cannot access 'file3': No such file or directory
$ echo $test
error
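
Applied to a pipeline like the one in the question, the same idea might look like the sketch below. It assumes the first command returns a non-zero exit status when it fails, which is not confirmed here for exiftool. strip_meta and md5_if_ok are hypothetical names; strip_meta stands in for exiftool so the example runs anywhere:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for exiftool: "good" succeeds and emits bytes
# (including a null byte); anything else fails with a message on stderr.
strip_meta() {
  if [ "$1" = good ]; then
    printf 'pixels\0more'
  else
    echo "Error: Writing of this type of file is not supported - $1" >&2
    return 1
  fi
}

# Prints the md5 and returns 0 only when the first pipeline stage succeeded.
md5_if_ok() {
  local sum
  # The subshell exits with the status of the first stage, so the
  # assignment's exit status reflects strip_meta rather than awk.
  if sum=$(strip_meta "$1" | md5sum | awk '{print $1}'; exit "${PIPESTATUS[0]}"); then
    printf '%s\n' "$sum"
  else
    return 1
  fi
}

md5_if_ok good
md5_if_ok bad || echo "skipped: first command failed"
```

The binary data never passes through a shell variable; only the 32-character hash is captured.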

Another approach is to check first whether the file is an image or a video.

type=$(file "/path/to/file.ext")
case $type in
    *image*|*Media*) echo "is an image or video";;
esac
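
A variation on this check matches the MIME type instead of the free-form description. This is only a sketch: is_media_mime and is_media_file are hypothetical helper names, and a matching type does not guarantee that exiftool can write that particular format, so it works as a pre-filter rather than a replacement for error detection:

```shell
#!/usr/bin/env bash
# Returns 0 when the given MIME type names an image or a video.
is_media_mime() {
  case $1 in
    image/*|video/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: asks file(1) for the MIME type of a path.
is_media_file() {
  is_media_mime "$(file --brief --mime-type -- "$1")"
}

is_media_mime "image/png"  && echo "image/png: media"
is_media_mime "text/plain" || echo "text/plain: not media"
```

With these helpers, a call such as is_media_file "/path/to/file.ext" could gate the exiftool pipeline.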

Upvotes: 3

Charles Duffy

Reputation: 295403

A coprocess can be used for this:

#!/usr/bin/env bash
case $BASH_VERSION in [0-3].*) echo "ERROR: Bash 4+ required" >&2; exit 1;; esac

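# Coprocess: consumes its stdin and prints 1 if it read at least one line, else 0.
coproc STDERR_CHECK { seen=0; while IFS= read -r; do seen=1; done; echo "$seen"; }
{
  md5=$(exiftool "/path/to/file.ext" -all= -o - | md5sum | awk '{print $1}')
} 2>&${STDERR_CHECK[1]}   # stderr of the whole pipeline goes to the coprocess
exec {STDERR_CHECK[1]}>&-  # close our write end so the coprocess sees EOF
read stderr_seen <&"${STDERR_CHECK[0]}"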

if (( stderr_seen )); then
  echo "exiftool emitted stdout with md5 $md5, and had content on stderr"
else
  echo "exiftool emitted stdout with md5 $md5, and did not emit any content on stderr"
fi

Upvotes: 2

William Pursell

Reputation: 212248

Just capture the output, and then conditionally process it, e.g.:

if out="$(exiftool "/path/to/file.ext" -all= -o - )"; then
    md5=$(echo "$out" | md5sum | awk '{print $1}')
fi

This makes the assignment to out and returns the exit status of exiftool, which is checked by the if. Note that this construction assumes that exiftool returns a reasonable exit value.
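
A different technique, sketched here under assumptions, keeps the binary output out of shell variables altogether: stdout streams into md5sum while stderr goes to a temporary file, which is then tested for content. This detects stderr output rather than the exit status. strip_meta is a hypothetical stand-in for exiftool and md5_unless_stderr is an illustrative name:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for exiftool: "good" emits bytes on stdout,
# anything else emits only an error on stderr.
strip_meta() {
  if [ "$1" = good ]; then
    printf 'pixels\0more'
  else
    echo "Error: Writing of this type of file is not supported - $1" >&2
    return 1
  fi
}

# Prints the md5 and returns 0 unless the command wrote anything to stderr.
md5_unless_stderr() {
  local errfile sum rc=0
  errfile=$(mktemp) || return 2
  # stdout streams straight into md5sum; only stderr is stored on disk.
  sum=$(strip_meta "$1" 2>"$errfile" | md5sum | awk '{print $1}')
  if [ -s "$errfile" ]; then
    rc=1                      # stderr had content: discard the hash
  else
    printf '%s\n' "$sum"
  fi
  rm -f -- "$errfile"
  return "$rc"
}

md5_unless_stderr good
md5_unless_stderr bad || echo "skipped: stderr had content"
```

The command still runs only once per file, and the temporary file stays small because it holds error text rather than file data.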

Upvotes: 0

vdavid

Reputation: 2544

md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep -q .)

This opens file descriptor 3 and redirects it to file descriptor 1 (a.k.a. stdout).

The trick is to redirect exiftool outputs:

  • exiftool ... 2>&1 redirects file descriptor 2 (a.k.a. stderr) to wherever stdout currently points
  • exiftool ... 1>&3 then redirects stdout to file descriptor 3 which, at this moment, points to the original stdout

Then fd 3 is redirected to another chain of commands using process substitution, i.e. 3> >(md5sum | awk '{print $1}' >&3), where 3> redirects fd 3 and >(...) is the process substitution itself.

At the same time, the standard error of exiftool is written to the standard output which is piped into grep -q . which will return 0 if there is at least one character.

Because grep -q . is the last command executed in the main chain of commands, you can simply check the results of $?:

md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep -q .)
if [ $? -eq 0 ]
then
  # something was written to exiftool's stderr
  :  # no-op placeholder (an if body cannot be empty); handle the error here
fi

The error message itself is not printed. If you want to see the error without capturing it in md5, replace grep -q . with grep . >&2:

md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep . >&2)

It is very important that you redirect exiftool outputs in this very order. If you redirected like this:

exiftool "/path/to/file.ext" -all= -o - 1>&3 2>&1

Then stdout is redirected to fd 3 first, and stderr is then redirected to stdout. Because 1>&3 occurs before 2>&1, stderr ends up pointing wherever stdout points at that time, which is fd 3. You definitely don’t want that.
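
The effect of the ordering can be checked in isolation. In this small sketch, emit is a hypothetical stand-in that writes one line to each stream, and fd 3 is pointed at /dev/null so that only what reaches the command substitution's stdout is captured:

```shell
#!/usr/bin/env bash
# Stand-in command: one line on stdout, one line on stderr.
emit() { echo "to-stdout"; echo "to-stderr" >&2; }

# Order A: stderr goes to the captured stdout, then stdout moves to fd 3.
order_a=$( { emit 2>&1 1>&3; } 3>/dev/null )

# Order B: stdout moves to fd 3 first, so 2>&1 sends stderr to fd 3 as well.
order_b=$( { emit 1>&3 2>&1; } 3>/dev/null )

printf 'A captured: [%s]\n' "$order_a"   # prints: A captured: [to-stderr]
printf 'B captured: [%s]\n' "$order_b"   # prints: B captured: [] (both streams went to fd 3)
```

Only order A leaves stderr on the path that is piped to grep.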

The end of the process substitution chain writes to fd3 with >&3 because you want to keep the result to fd3. Without >&3, the result of awk would end up in fd1 which would be piped to grep -q . or grep . >&2 and, again, you definitely don’t want that.

PS. You don’t need to close fd 3 because it was opened inside the subshell that runs the command substitution assigning md5. Should you need to close the file descriptor, run exec 3>&-

Upvotes: 2
