Reputation: 19
Given a list with one element per line (occasionally with some blank lines), e.g.:
22008
6881
6881
22008
6881
22008
22008
6881
56515
8080
8080
56515
22008
45682
45682
22008
I would like to get as output a list with unique items sorted by number of occurrences:
22008 - 6
6881 - 4
8080 - 2
45682 - 2
56515 - 2
Thanks!
Upvotes: 1
Views: 122
Reputation: 2320
You can use awk and sort. The array cnt uses the number in column 1 ($1) as its index, and cnt[$1]++ adds 1 to that entry on each row. The result is piped (|) to sort, which orders the lines by the count in column 3 (-k3,3), numerically (n) and in reverse (r), so counts of 10 or more also sort correctly:
awk '/[0-9]/ {cnt[$1]++}END{for(k in cnt) print k,"- " cnt[k]}' file.txt | sort -k3,3nr
If you remove the /[0-9]/ pattern, you'll also get the number of blank lines as a bonus :). If you want, you can use /^[0-9]+$/ to require that the whole line is a number; but, as we use $1 for counting, it doesn't really matter here.
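A minimal sketch of the awk approach wrapped in a function, sorting numerically on the count field so that counts of 10 or more are ordered correctly. The function name count_awk and the generated sample input are made up for illustration.

```shell
#!/bin/sh
# count_awk: hypothetical helper. Reads one item per line on stdin and
# prints "item - count", highest count first.
count_awk() {
    # Skip lines without a digit (e.g. blank lines), count each value of $1.
    awk '/[0-9]/ { cnt[$1]++ }
         END { for (k in cnt) print k, "-", cnt[k] }' |
    # Field 3 holds the count; compare it numerically, in reverse.
    sort -k3,3nr
}

# Sample input: twelve lines of 7, then three lines of 9 and a blank line.
{
    i=0
    while [ "$i" -lt 12 ]; do echo 7; i=$((i + 1)); done
    printf '9\n\n9\n9\n'
} | count_awk
# prints:
# 7 - 12
# 9 - 3
```

A plain text comparison of the count field would put "3" ahead of "12" in reverse order, which is why the n flag is used here.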
Upvotes: 1
Reputation: 1584
The uniq command has an option -c to emit the number of consecutive occurrences it finds. The solution then is to first remove blank lines and sort the list for input to uniq -c, then sort the output on the first field, which contains the number of occurrences.
Output of sed '/^\s*$/d' | sort | uniq -c | sort -k1nr is:
6 22008
4 6881
2 45682
2 56515
2 8080
Note the option to sort at the end: -k1nr means sort on the first field, numerically, in reverse (i.e. descending) order.
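If the "item - count" layout from the question is wanted, the uniq -c output can be rearranged with a final awk step. This is only a sketch: the function name count_uniq is made up, grep with a POSIX character class stands in for the sed command as a way of dropping blank lines, and each line is assumed to hold a single token without spaces.

```shell
#!/bin/sh
# count_uniq: hypothetical helper. Reads one item per line on stdin and
# prints "item - count", highest count first.
count_uniq() {
    grep -v '^[[:space:]]*$' |      # drop empty and whitespace-only lines
    sort |                          # group identical lines for uniq
    uniq -c |                       # emit "count item"
    sort -k1,1nr |                  # highest count first
    awk '{ print $2, "-", $1 }'     # swap columns into "item - count"
}

printf '22008\n6881\n\n22008\n' | count_uniq
# prints:
# 22008 - 2
# 6881 - 1
```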
Upvotes: 1
Reputation: 113834
Numbers sorted by number of occurrences:
$ grep -vE '^$' file | sort | uniq -c | sort -rn
6 22008
4 6881
2 8080
2 56515
2 45682
grep -vE '^$' file
Remove empty lines from file
sort | uniq -c
Sort the numbers then print unique ones with a count of their occurrences.
sort -rn
Sort numerically in declining order by occurrence count.
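The order of items with equal counts depends on how the final sort breaks ties. A sketch that makes this explicit by adding the item itself as a second, ascending numeric key; the function name count_ties is hypothetical, and the awk step is only there to strip the leading blanks that uniq -c puts in front of the count.

```shell
#!/bin/sh
# count_ties: hypothetical helper. Prints "count item", highest count first;
# items with the same count are listed in ascending numeric order.
count_ties() {
    grep -v '^$' |
    sort |
    uniq -c |
    # Key 1: count, numeric, descending. Key 2: item, numeric, ascending.
    sort -k1,1nr -k2,2n |
    awk '{ print $1, $2 }'
}

printf '8080\n45682\n8080\n\n45682\n22008\n22008\n22008\n56515\n56515\n' | count_ties
# prints:
# 3 22008
# 2 8080
# 2 45682
# 2 56515
```

With this tie-break, the items with a count of 2 come out as 8080, 45682, 56515, which is the order shown in the question's sample output.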
Upvotes: 2