user121196

Reputation: 30990

Counting unique strings where there's a single string per line in bash

Given input file

z
b
a
f
g
a
b
...

I want to output the number of occurrences of each string, for example:

z 1
b 2
a 2
f 1
g 1

How can this be done in a bash script?

Upvotes: 1

Views: 1351

Answers (6)

Balaswamy Vaddeman

Reputation: 8530

You can use sort filename | uniq -c.

Have a look at the Wikipedia page on uniq.
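Note that `uniq -c` prints the count before the string; piping through a small `awk` swap produces the exact `string count` layout the question asks for (a sketch, assuming the input is in a file named `filename`):

```shell
printf '%s\n' z b a f g a b > filename   # sample input from the question
sort filename | uniq -c | awk '{print $2, $1}'
# a 2
# b 2
# f 1
# g 1
# z 1
```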

Upvotes: 0

potong

Reputation: 58371

This might work for you:

cat -n file | 
sort -k2,2 | 
uniq -cf1 | 
sort -k2,2n | 
sed 's/^ *\([^ ]*\).*\t\(.*\)/\2 \1/'

This outputs the number of occurrences of each string, in the order in which the strings first appear.
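On the question's sample input, the pipeline produces the counts in first-appearance order (a sketch; it assumes GNU tools, since `\t` in `sed` is a GNU extension):

```shell
printf '%s\n' z b a f g a b > file   # sample input from the question
cat -n file |
sort -k2,2 |
uniq -cf1 |
sort -k2,2n |
sed 's/^ *\([^ ]*\).*\t\(.*\)/\2 \1/'
# z 1
# b 2
# a 2
# f 1
# g 1
```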

Upvotes: 0

Mat

Reputation: 206689

Here's a bash-only version (requires bash version 4), using an associative array.

#! /bin/bash

declare -A count
while read val ; do
    count[$val]=$(( ${count[$val]} + 1 ))
done < your_intput_file # change this as needed

for key in ${!count[@]} ; do
    echo $key ${count[$key]}
done

Upvotes: 1

johnsyweb

Reputation: 141790

You can sort the input and pass to uniq -c:

$ sort input_file | uniq -c
 2 a
 2 b
 1 f
 1 g
 1 z

If you want the numbers on the right, use awk to switch them:

$ sort input_file | uniq -c | awk '{print $2, $1}'
a 2
b 2
f 1
g 1
z 1

Alternatively, do the whole thing in awk:

$ awk '
{
    ++count[$1]
}
END {
    for (word in count) {
        print word, count[word]
    }
}
' input_file
f 1
g 1
z 1
a 2
b 2

Upvotes: 4

Mithrandir

Reputation: 25337

Try:

awk '{ freq[$1]++; } END{ for( c in freq ) { print c, freq[c] } }' test.txt

Where test.txt would be your input file.

Upvotes: 1

shadyabhi

Reputation: 17234

cat text | sort | uniq -c

should do the job
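The `cat` is not strictly needed; `sort` can read the file directly, saving one process (a minor variation, assuming the input is in a file named `text`):

```shell
printf '%s\n' z b a f g a b > text   # sample input from the question
sort text | uniq -c                  # same output as: cat text | sort | uniq -c
```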

Upvotes: 1
