Reputation: 269
I'm looking for an awk (or sed) one-liner that removes lines from the output if their first field is a duplicate.
An example I've seen for removing duplicate lines is:
awk 'a !~ $0; {a=$0}'
I tried using it as a basis with no luck (I thought changing the $0's to $1's would do the trick, but it didn't seem to work).
Upvotes: 14
Views: 24689
Reputation: 960
This prints every line whose first field is unique, plus the first line of each group that shares a first field:
awk '!a[$1]++' file_name
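As a quick check of the one-liner above, here is a minimal sketch run on made-up sample data (the input lines and the names `seen` and `result` are assumptions for illustration, not from the original post). The third and fifth lines repeat an earlier first field, so they should be dropped.

```shell
# Hypothetical sample input: first field repeats on lines 3 and 5.
result=$(printf 'a 1\nb 2\na 3\nc 4\nb 5\n' | awk '!seen[$1]++')

# seen[$1]++ evaluates to 0 (false) the first time a key appears,
# so !seen[$1]++ is true only for the first line with that key.
printf '%s\n' "$result"
```

Only the lines `a 1`, `b 2` and `c 4` should survive, in their original order.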
Upvotes: 0
Reputation: 3451
If you're open to using Perl:
perl -ane 'print if ! $a{$F[0]}++' file
-a autosplits the line into the @F array, which is indexed starting at 0.
The %a hash remembers whether the first field has already been seen.
This related solution assumes your field separator is a comma rather than whitespace:
perl -F, -ane 'print if ! $a{$F[0]}++' file
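For comparison, the comma-separated case can probably be handled the same way in awk by setting the field separator with `-F,`. This is a sketch on invented input (the data and the names `seen` and `csv_result` are assumptions), not part of the original answer.

```shell
# Hypothetical comma-separated input: "x" appears twice as the first field.
csv_result=$(printf 'x,1\ny,2\nx,3\n' | awk -F, '!seen[$1]++')

# -F, makes awk split fields on commas, so $1 is the text before the first comma.
printf '%s\n' "$csv_result"
```

Without `-F,` awk would split on whitespace, each whole line would become `$1`, and nothing here would be treated as a duplicate.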
Upvotes: 1
Reputation: 753785
awk '{ if (a[$1]++ == 0) print $0; }' "$@"
This is a standard (very simple) use for associative arrays.
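To see what the associative array is doing, a small sketch like the following prints the counter value awk compares against 0 for each line. The sample input, the `seen_before` label and the `trace` variable are made up for illustration.

```shell
# Hypothetical input; print each first field with the count stored BEFORE the increment.
trace=$(printf 'a 1\nb 2\na 3\n' | awk '{ printf "%s seen_before=%d\n", $1, a[$1]++ }')

# An unset array element reads as 0, and the post-increment returns the old value,
# so the first occurrence of each key reports 0 and later ones report 1, 2, ...
printf '%s\n' "$trace"
```

A line is printed by the answer's command exactly when this value is 0, which is why only the first occurrence of each first field is kept.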
Upvotes: 26