Reputation: 113
I am looking to print columns 1 & 2 where column 1 matches and column 3 is the highest value. I am currently using awk and sort to get this type of output:
EXCEPTION 91 3
EXCEPTION 15 5
TEST 5 1
TEST 1 8
the end desired output I am looking for:
EXCEPTION 15
TEST 1
Here is a file example and the commands I am running to get the uniq counts. What I would really like is for sort to print the last record in the uniq sort
EXCEPTION 15
so I don't have to all the crazy uniq --count logic. IE: I want to know if column1 matches >= 3times and print the last recorded column two value for that match.
cat /tmp/testing.txt |grep EXCEPTION
EXCEPTION 15
EXCEPTION 15
EXCEPTION 15
EXCEPTION 91
EXCEPTION 91
EXCEPTION 91
EXCEPTION 91
EXCEPTION 15
EXCEPTION 15
cat /tmp/testing.txt|awk '{print $1 " " $2}'|sed '/^$/d'| awk '$2 >= '1' '|sort |uniq --count|awk '{print $1" "$2" "$3}'|awk '$1 >= '3''|awk '{print $1" "$2" "$3}'|awk '{print $2" "$3" "$1}'
EXCEPTION 15 5
EXCEPTION 91 4
Upvotes: 0
Views: 1097
Reputation: 729
You said you didn't want horrible uniq
logic... but in case you change your mind, this task does fit quite neatly into sort/uniq/cut's purview (though this isn't as efficient as the awk solution).
From your testing file you can get your desired output with
sort -k1,2n < testing |
uniq -c |
sort -k2,2 -k1rn,1 |
cut -c8- |
sort -u -k1,1
In order: sort by first column alphabetically (default) and then second column numerically - this puts identical lines in sequence.
Then count the occurences of each line, prepend 8 characters to each line containing the count along with whitespaces.
Sort by count descending, after sorting by the string (EXCEPTION, TEST) which is now the second field.
Remove the first 8 characters from each line (the counts).
Finally "sort" by the string and only output uniques. As the record you're interested in has been sorted to the top, this is the one it outputs. This can be thought of as "uniq
by field."
(If you want to remove the trailing spaces from your input you can replace the cut
command with sed 's/^ *[0-9]\+ *//'
)
Upvotes: 0
Reputation: 289955
Just keep track of the maximums for any given 1st field and store its corresponding 2nd field:
awk '{if ($3>max[$1])
{max[$1]=$3; val[$1]=$2}
}
END {for (i in val) print i, val[i]}' file
$ awk '{if ($3>max[$1]) {max[$1]=$3; val[$1]=$2}} END {for (i in val) print i, val[i]}' file
EXCEPTION 15
TEST 1
Upvotes: 1