Reputation: 2592

Bash Process Substitution usage with tee and while loop

I want to use nested process substitution with tee in a while loop.

while read line; do
  #process line
  echo "--$line"
done < <(cat sample.file | tee >(grep "SPECLINE") | grep "LINESTOPROCESS")

What I need:

All lines in sample.file that contain "LINETOPROCESS" should be passed into the loop and printed with a "--" prefix.
All lines that contain "SPECLINE" should be printed by tee's first process substitution (the grep).

I want to avoid reading sample.file more than once, as it is too large and heavy.

With a simple sample.file:

line1 SPECLINE
line2 LINETOPROCESS
line3 LINETOPROCESS
line4 SPECLINE
line5 I don't need it
line6 also not
line7 also not
line8 SPECLINE
line9 LINETOPROCESS

My result:

# ./test.sh
#

My desired result:

# ./test.sh
line1 SPECLINE 
--line2 LINETOPROCESS
--line3 LINETOPROCESS
line4 SPECLINE
line8 SPECLINE
--line9 LINETOPROCESS

Or I can also accept this as output:

# ./test.sh
--line2 LINETOPROCESS
--line3 LINETOPROCESS
--line9 LINETOPROCESS
line1 SPECLINE 
line4 SPECLINE
line8 SPECLINE

UPDATE1

The greps are for demonstration only; I really do need both substitutions.

sample.file is an http file.
grep "SPECLINE" would be: hxselect -i -s ';' -c 'div.hour'
grep "LINESTOPROCESS" would be: hxselect -i -s ';' -c 'div.otherclass' | hxpipe

The hx programs are not line-oriented; they read from stdin and write to stdout.

So tee's first command selects the divs with class 'hour' and separates them with ';'. The pipe after tee then selects all divs with class 'otherclass', and hxpipe flattens the result so the loop can process it further.

Upvotes: 0

Answers (3)

Walter A

Reputation: 20012

If you want to keep the tee, you need to make two changes.
Your test code greps for LINESTOPROCESS, but the input contains LINETO..
The output process substitution causes problems, as explained in https://stackoverflow.com/a/42766913/3220113. You can do this differently.

while IFS= read -r line; do
  #process line
  echo "--$line"
done < x2 |
tee >(grep "SPECLINE") >(grep "LINETOPROCESS") >/dev/null

I don't know hxselect, but it seems to operate on a complete well-formed XML document, so avoid the grep.
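
One likely reason the original attempt prints nothing: a command inside >(...) inherits tee's stdout, which in that pipeline is the pipe feeding the second grep, so the SPECLINE lines are filtered away again. A minimal sketch of one way around this, assuming it is acceptable for the first consumer to write to stderr (the function name split_stream is hypothetical, not from the question):

```shell
#!/usr/bin/env bash
# Sketch only: split_stream is a hypothetical helper name.
# The file is read once; tee duplicates the stream to two consumers.
# ">&2" keeps the SPECLINE matches out of the pipe that feeds the
# second grep, which would otherwise discard them.
split_stream() {
  local file=$1 line
  while IFS= read -r line; do
    # process line
    printf '%s\n' "--$line"
  done < <(tee >(grep "SPECLINE" >&2) < "$file" | grep "LINETOPROCESS")
}
```

Called as split_stream sample.file, this should print the "--" lines on stdout and the SPECLINE lines on stderr. The first grep runs asynchronously, so the relative order of the two groups is not guaranteed. The same shape should carry over to the hxselect commands from the question by swapping them in for the two greps, though that is untested here.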

Upvotes: 0

Ralf

Reputation: 1813

The following loops through the entire file once and prints only the matching lines. All other lines are ignored.

while IFS= read -r line; do
    case "$line" in
        *SPECLINE*) echo "$line" ;;
        *LINETOPROCESS*) echo "--$line" ;;
    esac
done < sample.file

Upvotes: 0

chepner

Reputation: 531345

I would use no process substitution at all.

while IFS= read -r line; do
  if [[ $line = *SPECLINE* ]]; then
    printf '%s\n' "$line"
  elif [[ $line = *LINETOPROCESS* ]]; then
    printf -- '--%s\n' "$line"   # "--" first, so bash does not parse the format as an option
  fi
done < sample.txt

You are already paying the cost of reading an input stream line-by-line in bash; no reason to add the overhead of two separate grep processes to it.

A single awk process would be even better, as it is more efficient than bash's read-one-character-at-a-time approach to reading lines of text.

awk '/SPECLINE/ {print} /LINETOPROCESS/ {print "--"$0}' sample.txt

(which is too simple if a single line could match both SPECLINE and LINETOPROCESS, but I leave that as an exercise to the reader to fix.)
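
One way to handle a line that matches both patterns, as a sketch. The assumption here is that SPECLINE should win, mirroring the if/elif order of the bash loop above; filter_lines is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# Sketch only: filter_lines is a hypothetical wrapper around the awk one-liner.
# Assumption: a line matching both patterns is treated as a SPECLINE line.
filter_lines() {
  awk '
    /SPECLINE/      { print; next }     # print as-is and skip the second rule
    /LINETOPROCESS/ { print "--" $0 }   # only reached when SPECLINE did not match
  ' "$@"
}
```

Usage would be filter_lines sample.txt, or piping into filter_lines with no arguments. If both actions should fire for such a line, dropping the next statement restores the original behaviour.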

Upvotes: 4
