Reputation: 19

Order lines by number of occurrences

Given a list with one element per line (with occasionally some blank lines), e.g.:

I would like to get as output a list with unique items sorted by number of occurrences:

Thanks!

Upvotes: 1

Answers (3)

Vidhya G

Reputation: 2320

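You can use awk and sort. The array cnt is indexed by the value in column 1 ($1), and cnt[$1]++ adds 1 to that entry for every matching row. The END block prints each value with its count, and the output is piped (|) to sort, which orders it by the count in column 3 (-k3,3), numerically (n) and in reverse (r). The numeric key matters: a string comparison would rank a count of 9 above a count of 10.

awk '/[0-9]/ {cnt[$1]++} END {for (k in cnt) print k, "-", cnt[k]}' file.txt | sort -k3,3nr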

If you remove the /[0-9]/ you'll also get the number of blank lines as a bonus :).

If you want, you can use /^[0-9]+$/ to match only lines that consist entirely of digits; but, as we use $1 for counting, it doesn't really matter for input like this.
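As a variation on the same idea, here is a minimal sketch that prints the count first, so the sort key is simply the first field. The function name count_lines and the sample input are made up for illustration; NF is used instead of a digit pattern so that empty and whitespace-only lines are skipped.

```shell
# count_lines: hypothetical helper that reads one item per line on stdin.
count_lines() {
    # NF is 0 for empty or whitespace-only lines, so those are skipped.
    awk 'NF { cnt[$1]++ } END { for (k in cnt) print cnt[k], k }' |
        # Sort by count, descending; break ties by the value, ascending.
        sort -k1,1nr -k2,2n
}

# Made-up sample input with one blank line.
printf '%s\n' 8080 22008 '' 22008 6881 22008 8080 | count_lines
# prints:
# 3 22008
# 2 8080
# 1 6881
```

The second sort key is only there to make the order of equal counts predictable, since awk's for (k in cnt) loop visits keys in an unspecified order.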

Upvotes: 1

halfflat

Reputation: 1584

The uniq command has an option -c that prefixes each line with the number of consecutive occurrences it finds. The solution, then, is to remove blank lines and sort the list before passing it to uniq -c, then sort the output on the first field, which holds the occurrence count.

Output of sed '/^\s*$/d' | sort | uniq -c | sort -k1nr is

Note the option to sort at the end: -k1nr means sort on the first field, numerically, in reverse (i.e. descending) order.
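To see why the list has to be sorted before it reaches uniq -c, here is a small sketch on made-up input in which the repeated item is not on adjacent lines. The sample function and its data are hypothetical; the trailing awk only strips the padding that uniq -c puts in front of the count.

```shell
# Made-up sample: "a" appears twice, but not on adjacent lines.
sample() { printf '%s\n' a b a; }

# Without sort, uniq -c sees three separate runs of length 1.
sample | uniq -c | awk '{ print $1, $2 }'
# prints:
# 1 a
# 1 b
# 1 a

# With sort first, the two "a" lines become adjacent and are counted together.
sample | sort | uniq -c | awk '{ print $1, $2 }'
# prints:
# 2 a
# 1 b
```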

Upvotes: 1

John1024

Reputation: 113834

Numbers sorted by number of occurrences:

$ grep -vE '^$' file | sort | uniq -c | sort -rn
      6 22008
      4 6881
      2 8080
      2 56515
      2 45682

How it works

grep -vE '^$' file

Remove empty lines from file.

sort | uniq -c

Sort the numbers, then print the unique ones with a count of their occurrences.

sort -rn

Sort numerically in declining order by occurrence count.
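One caveat, in case the blank lines in the input contain spaces or tabs: the pattern '^$' matches only lines that are truly empty. The sketch below compares it with a POSIX character class on made-up data; the function names drop_empty and drop_blank are hypothetical.

```shell
# drop_empty removes only zero-length lines; drop_blank also removes
# lines made up entirely of whitespace. Both read stdin.
drop_empty() { grep -vE '^$'; }
drop_blank() { grep -vE '^[[:space:]]*$'; }

# Made-up sample: one empty line and one line holding three spaces.
sample() { printf '%s\n' 22008 '' '   ' 6881; }

sample | drop_empty | wc -l   # 3 lines remain: the spaces-only line survives
sample | drop_blank | wc -l   # 2 lines remain
```

If the spaces-only line is left in, it shows up in the uniq -c output as an item of its own.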

Upvotes: 2
