Reputation: 11
I've got a file that looks like this:
1 a
3 b
2 b
9 a
0 a
5 c
8 b
I'd like...
... all this in a single awk program.
So the final output would be something like:
x
0
a
x
8
b
y
5
c
I succeeded in doing all this, but only by using two awk programs and one external command:
awk -F '\t' '
    { value[$2] = $2 "\t" $1 }
    END { for (i in value) print value[i] }
' |
sort -dfb |
awk -F '\t' '{
    if ($1 == "a" || $1 == "b") print "x\n" $2 "\n" $1
    if ($1 == "c") print "y\n" $2 "\n" $1
}'
A simpler way to do this would be to sort the array of the first awk program alphabetically by key. That would make it possible to merge the second awk program into the first. However, I have no idea how to do this. Any ideas?
Upvotes: 0
Views: 2142
Reputation: 1
This was asked six years ago, and here I am replying. If I understand the request, the list of values is:
1 a
3 b
2 b
9 a
0 a
5 c
8 b
Each distinct column 2 value should appear only once, paired with the lowest column 1 value associated with it. The desired result:
0 a
2 b
5 c
The simplest approach seemed to be two sorts instead of awk. With the list of values saved in FILE, the following commands produce the result:
$ sort -k1,1n FILE | sort -k2,2 -u
0 a
2 b
5 c
For the reverse, the highest column 1 value for each unique column 2 value:
$ sort -k1,1nr FILE | sort -k2,2 -u
9 a
8 b
5 c
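The two-pass idea relies on the second sort keeping the first line it sees for each key. Here is a minimal sketch that makes that explicit. It assumes a sort that supports the -s (stable) option, as the GNU and BSD versions do, and the function name extreme_per_key is made up for illustration:

```shell
# extreme_per_key: hypothetical helper; $1 is "min" or "max", data on stdin
extreme_per_key() {
    case $1 in
        min) order=n ;;    # ascending numeric order on column 1
        max) order=nr ;;   # descending numeric order on column 1
        *) echo "usage: extreme_per_key min|max" >&2; return 1 ;;
    esac
    # pass 1: order the lines by column 1
    # pass 2: stable sort on column 2, keeping only the first line per key
    sort -k1,1"$order" | sort -s -u -k2,2
}

printf '1 a\n3 b\n2 b\n9 a\n0 a\n5 c\n8 b\n' | extreme_per_key min
```

Passing max instead of min gives the highest column 1 value per key.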
If awk is preferred over sort, the following awk program takes the smallest column 1 value for each unique column 2 entry:
$ awk '{if($2 in COL2){if(COL2[$2]>$1){COL2[$2]=$1}}else{COL2[$2]=$1}}END{for(I in COL2){print COL2[I],I}}' FILE
0 a
2 b
5 c
The reverse, the highest value of column 1 for each unique column 2 entry, is accomplished by replacing ">" with "<":
$ awk '{if($2 in COL2){if(COL2[$2]<$1){COL2[$2]=$1}}else{COL2[$2]=$1}}END{for(I in COL2){print COL2[I],I}}' FILE
9 a
8 b
5 c
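One caveat: awk does not promise any particular order for "for (I in COL2)", so the lines may not always come out sorted by key. A possible sketch of the same minimum-per-key logic, spread over several lines and with the END output piped through sort from inside awk, could look like this (the function name min_per_key and the array name best are only illustrative):

```shell
# min_per_key: hypothetical helper; reads "value key" lines on stdin
min_per_key() {
    awk '
    {
        # keep the smallest column-1 value seen so far for each column-2 key
        if (!($2 in best) || $1 + 0 < best[$2] + 0)
            best[$2] = $1
    }
    END {
        # the loop order is unspecified, so let sort order the output by key
        for (k in best)
            print best[k], k | "sort -k2,2"
        close("sort -k2,2")
    }'
}

printf '1 a\n3 b\n2 b\n9 a\n0 a\n5 c\n8 b\n' | min_per_key
```

Changing "<" to ">" in the comparison gives the maximum per key instead.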
Possibly I missed the requirements, and six years later is not a very timely response. I was looking for something else, found this, and couldn't help myself.
Upvotes: 0
Reputation: 28000
GNU awk <= 3:
WHINY_USERS= awk 'END {
  for (R in r)
    printf "%s\n%s\n%s\n",
      (R ~ /^[ab]$/ ? "x" : "y"), r[R], R
}
{
  r[$2] = $1
}' infile
GNU awk >= 4:
awk 'END {
  PROCINFO["sorted_in"] = "@ind_str_asc"
  for (R in r)
    printf "%s\n%s\n%s\n",
      (R ~ /^[ab]$/ ? "x" : "y"), r[R], R
}
{
  r[$2] = $1
}' infile
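Both variants above depend on gawk-specific behaviour. If another awk is in use, one possible approach is to record the keys in a separate array and sort that array by hand in the END block. This is only a sketch: the function name last_per_key_sorted and the arrays keys and r are illustrative, and the insertion sort is fine for a handful of keys but slow for large inputs:

```shell
# last_per_key_sorted: hypothetical helper; reads "value key" lines on stdin
last_per_key_sorted() {
    awk '
    {
        # note each key the first time it appears; keep its most recent value
        if (!($2 in r))
            keys[++n] = $2
        r[$2] = $1
    }
    END {
        # insertion sort of the key list; appending "" forces string comparison
        for (i = 2; i <= n; i++) {
            k = keys[i]
            for (j = i - 1; j >= 1 && (keys[j] "") > (k ""); j--)
                keys[j + 1] = keys[j]
            keys[j + 1] = k
        }
        # same three-line output per key as in the programs above
        for (i = 1; i <= n; i++)
            printf "%s\n%s\n%s\n", (keys[i] ~ /^[ab]$/ ? "x" : "y"), r[keys[i]], keys[i]
    }'
}

printf '1\ta\n3\tb\n2\tb\n9\ta\n0\ta\n5\tc\n8\tb\n' | last_per_key_sorted
```

Like the gawk versions, this keeps the last value seen for each key, which is what the two-program pipeline in the question does.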
Upvotes: 1