Konrad

Reputation: 73

Using grep in a while loop breaks the loop

I want to write a bash script that prints the least frequently occurring line of standard input.

I wrote this code:

#!/bin/bash
var=1000
while read line
do
    tmp=$(grep -c $line)
    if [ $tmp -lt $var ]
    then
        var=$tmp
        out=$line
    fi
done
var="$var $out"
echo $var

But when I run it on test input like this:

id1
id2
id3
id1
square
id1
id2
id3
id1
circle
id2
id2

the program only enters the loop once, so it gives the wrong output:

3 id1

when the correct output should be

1 square

This line

tmp=$(grep -c $line)

seems to be breaking the loop, but I can't work out why. Is there a way to avoid using grep in my code, or some other way to fix my script?

Upvotes: 1

Views: 1459

Answers (2)

tripleee

Reputation: 189357

The grep command has no file argument, so it reads the remainder of standard input and leaves nothing for the next read in the loop. You will need to copy the input to a temporary file if you want to both grep it and do something else with it.

A much simpler solution to your problem is

sort | uniq -c | sort -n | head -n 1

More generally, running grep on each line in a loop over a file is an antipattern, which often suggests moving to Awk or sed instead if you can't find a simple pipeline of standard tools to accomplish your goal.
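As a rough sketch of the Awk route, the following counts every line in a single pass and remembers the order in which lines were first seen, so that ties go to the line that appears first in the input. The function name least_frequent is made up for this illustration, and the tie-breaking rule is an assumption based on the expected output in the question.

```bash
#!/bin/bash
# least_frequent is a hypothetical helper name, not a standard tool.
least_frequent() {
    awk '
        !($0 in count) { order[++n] = $0 }   # remember first-seen order
        { count[$0]++ }                      # tally every line
        END {
            # Walk lines in first-seen order; a strict "<" keeps the
            # earliest line when several share the minimum count.
            for (i = 1; i <= n; i++)
                if (i == 1 || count[order[i]] < min) {
                    min = count[order[i]]
                    best = order[i]
                }
            if (n) print min, best
        }
    '
}

# Sample input from the question.
result=$(printf '%s\n' id1 id2 id3 id1 square id1 id2 id3 id1 circle id2 id2 | least_frequent)
echo "$result"   # prints "1 square"
```

Because Awk reads stdin exactly once, no temporary file is needed.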

Upvotes: 0

jil

Reputation: 2691

The problem in your code is that this grep

    tmp=$(grep -c $line)

will read from stdin and thus consume all the remaining lines on the very first iteration of the while loop. That is, read first puts the first line into $line; grep then searches for that string in the rest of stdin, so nothing is left for the next read and the loop ends.
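The effect can be reproduced with a small experiment. This is only an illustrative sketch: the three-line input and the iterations counter are made up for the demonstration.

```bash
#!/bin/bash
# Demonstration: a command inside the loop body inherits the loop's stdin.
iterations=0
while IFS= read -r line; do
    iterations=$((iterations + 1))
    # No file argument, so grep reads (and drains) the rest of stdin.
    count=$(grep -c -- "$line")
done < <(printf '%s\n' id1 id2 id1)

echo "iterations=$iterations count=$count"   # prints "iterations=1 count=1"
```

read takes "id1", grep then consumes the remaining two lines (finding one more "id1"), and the second read hits end of input, so the loop body runs only once.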

You could fix your code by using a temporary file, e.g.:

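#!/bin/bash
tmpfile=$(mktemp)
cat > "$tmpfile"    # save stdin so it can be read more than once
min=0
while IFS= read -r line; do
    # -x: match whole lines only, -F: treat the pattern as a fixed string.
    # grep is given the file explicitly, so it no longer touches stdin.
    count=$(grep -cxF -- "$line" "$tmpfile")
    if (( min == 0 || count < min )); then
        min=$count
        out="$min $line"
    fi
done < <(sort -u "$tmpfile")
rm -- "$tmpfile"
echo "$out"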

But this is of course a rather horrible solution, as it uses a temporary file and reads the input file many times. It would be better to use something like:

#!/bin/bash
sort | uniq -c | sort -n | head -n 1
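As a usage sketch, here is that pipeline run on the sample input from the question. The trailing sed is an addition made for this illustration; it strips the leading padding that uniq -c puts in front of the count.

```bash
#!/bin/bash
# Sample input from the question, one item per line.
input=$(printf '%s\n' id1 id2 id3 id1 square id1 id2 id3 id1 circle id2 id2)

# sort groups identical lines, uniq -c counts each group,
# sort -n orders by count, head keeps the smallest.
result=$(printf '%s\n' "$input" | sort | uniq -c | sort -n | head -n 1 | sed 's/^ *//')
echo "$result"   # prints "1 circle"
```

Note that "circle" and "square" both occur once. sort -n breaks the tie by comparing the whole line, so this prints "1 circle" rather than the "1 square" given in the question; if the first such line in input order is required, a different approach is needed.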

Upvotes: 2
