Reputation: 25
There is a text file:
$ cat tempfile.txt
123
567
345
123
789
234
123
234
345
789
and my desired output is the following:
123,3
789,2
345,2
I need to (1) sort the values by number of occurrences, (2) when the numbers of occurrences are equal, put the larger numerical value first, and (3) show only the top 3.
I tried this:
tr -c '[:alnum:]' '[\n*]' < tempfile.txt | sort -nr | uniq -c | sort -nr | head -3
But that only gives the output below. I still need to swap the count and the value, separate the two with a comma, and, when the counts are equal, put the larger value first.
3 123
2 234
2 345
Upvotes: 2
Views: 472
Reputation: 103884
You can use plain awk followed by a two-field sort:
$ awk '{arr[$1]++}
END{for (e in arr) print e "," arr[e]}' file |
sort -t , -k2rn -k1rn
Prints:
123,3
789,2
345,2
234,2
567,1
Pipe the result through head -n 3 to keep only the top three values.
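Putting the pieces together, here is a minimal sketch. The function name top_counts and its optional second argument are only illustrative, and the input is assumed to hold one number per line:

```shell
# top_counts FILE [N]: print the N most frequent values as "value,count".
# N defaults to 3. Ties on count are broken by the larger numeric value.
# Sort keys: field 2 (count) numeric descending, then field 1 (value)
# numeric descending; each key is limited to a single field.
top_counts() {
    awk 'NF { count[$1]++ }
         END { for (v in count) print v "," count[v] }' "$1" |
    sort -t , -k2,2nr -k1,1nr |
    head -n "${2:-3}"
}
```

With the sample file from the question, top_counts tempfile.txt should print 123,3 then 789,2 then 345,2.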
Upvotes: 1
Reputation: 1938
Try this one-liner:
sort tempfile.txt | uniq -c | sort -nr -k 1,1 -k 2,2 | awk '{print $2","$1; if (NR == 3) exit}'
It gives:
123,3
789,2
345,2
Upvotes: 3
Reputation: 12347
Use this command, which allows specific control of the sorting order:
sort -g tempfile.txt | uniq -c | perl -lane 'print join "\t", reverse @F;' | sort -k2,2gr -k1,1gr | head -n3 | tr '\t' ','
Output:
123,3
789,2
345,2
Upvotes: 0
Reputation: 133545
With a single GNU awk program, you could try the following:
awk '
{
a[$0]++
}
END{
PROCINFO["sorted_in"] = "@val_num_desc"
for(i in a){
c[a[i]]=(c[a[i]]>i?c[a[i]]:i)
}
PROCINFO["sorted_in"] = "@ind_num_desc"
for(o in c){
print o,c[o]
}
}' Input_file
For the provided sample, the output is as follows (count first, then the largest value seen with that count):
3 123
2 789
1 567
Explanation: a commented version of the program above.
awk ' ##Start of the awk program.
{
a[$0]++ ##Count occurrences: array a is indexed by the current line and its value is increased by 1.
}
END{ ##Start of the END block of this program.
PROCINFO["sorted_in"] = "@val_num_desc" ##Make gawk traverse arrays by value, numerically descending.
for(i in a){ ##Traverse a.
c[a[i]]=(c[a[i]]>i?c[a[i]]:i) ##Array c is indexed by the count a[i]: if the value already stored in
##c[a[i]] is greater than i, keep it, otherwise assign i to it.
}
PROCINFO["sorted_in"] = "@ind_num_desc" ##Make gawk traverse arrays by index, numerically descending.
for(o in c){ ##Traverse c.
print o,c[o] ##Print the index o (the count) and the value stored in c[o].
}
}' Input_file ##Name of the input file.
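To get the value,count layout and the top three entries the question asks for while staying inside awk, here is a rough sketch that avoids gawk-only features. The function name top3_awk is made up for illustration, and the input is assumed to be one number per line:

```shell
# top3_awk FILE: print up to three "value,count" lines, highest count
# first; ties on count go to the larger numeric value.
top3_awk() {
    awk '
    NF { count[$1]++ }                  # tally each non-empty line
    END {
        for (n = 1; n <= 3; n++) {      # pick the best remaining entry 3 times
            best = ""
            for (v in count)
                if (best == "" || count[v] > count[best] ||
                    (count[v] == count[best] && v + 0 > best + 0))
                    best = v
            if (best == "") break       # fewer than 3 distinct values
            print best "," count[best]
            delete count[best]          # remove it before the next round
        }
    }' "$1"
}
```

This rescans the array once per output line, which is fine for a top three but would be a poor fit for a large N; in that case an external sort is simpler.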
Upvotes: 0