Reputation: 3806
I'm trying to compute some stuff in awk and, at the end, print the result in the order of the input. For each line, I check whether it has already been seen. If not, I add it to the seen array and also store it in an order array.
{
    if (! $0 in seen) {
        seen[$0] = 1
        order[o++] = $0
    }
} END {
    for (i=0; i<o; i++)
        printf "%s\n", order[i]
}
You can try it with
printf 'a\nb\na\nc\nb\na\n' | awk script_above
It prints nothing. If I print the variable o at the end, its value is still 0. What am I doing wrong?
Upvotes: 1
Views: 3837
Reputation: 10865
You just need to add parens to get the right operator precedence*:
# a.awk
{
    if (!($0 in seen)) {
        seen[$0] = 1
        order[o++] = $0
    }
}
END {
    for (i=0; i<o; i++)
        printf "%s\n", order[i]
}
Test:
$ awk -f a.awk file
a
b
c
* (The unary ! binds more tightly than the in operator, so ! $0 in seen is parsed as (!$0) in seen rather than !($0 in seen): https://www.gnu.org/software/gawk/manual/html_node/Precedence.html)
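To make the precedence point concrete, here is a small sketch that evaluates both forms on each input line. The variable names bare, grouped, and result are made up for this illustration, and the behavior shown assumes an awk that follows the precedence table linked above.

```shell
# Compare the unparenthesized test with the parenthesized one on sample input.
result=$(printf 'a\nb\na\n' | awk '{
    bare    = (! $0 in seen)  ? "yes" : "no"   # parsed as (!$0) in seen, i.e. 0 in seen
    grouped = (!($0 in seen)) ? "yes" : "no"   # the intended membership test
    print $0, bare, grouped
    seen[$0] = 1
}')
printf '%s\n' "$result"
```

For a non-empty line such as a, !$0 evaluates to 0, so the bare form asks whether the key 0 exists in seen. It never does, which is why the original block never ran and o stayed at 0.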
Upvotes: 3
Reputation: 133610
What you are doing reads more like a shell approach. awk has an idiomatic way to check whether an element has already been recorded in an array; try the following:
printf 'a\nb\na\nc\nb\na\n' | awk '
!seen[$0]++ {
    order[o++] = $0
}
END {
    for (i=0; i<o; i++)
        printf "%s\n", order[i]
}'
Here !seen[$0]++ tests whether the count stored in the array seen for the current line is still zero or unset, meaning the line has not been seen before. If so, the condition is true and the block that follows runs. The ++ is a post-increment: it raises the count for that line by 1 after the test, so the next time the same line appears, !seen[$0]++ is no longer true.
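As a rough illustration of how the counter behaves, the sketch below prints, for each line, the count before the test, the result of the test, and the count afterwards. The names before, first, and counts are hypothetical and exist only for this demonstration.

```shell
# Trace the value of seen[$0] around the !seen[$0]++ test.
counts=$(printf 'a\nb\na\n' | awk '{
    before = seen[$0] + 0      # count before the test (0 for a new line)
    first  = !seen[$0]++       # 1 only on the first occurrence, then the count is incremented
    print $0, before, first, seen[$0]
}')
printf '%s\n' "$counts"
```

The second a shows a count of 1 before the test, so the test yields 0 and the line would be skipped.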
Upvotes: 3