Reputation: 346
Why doesn't this work? I want to print every line whose third column value appears more than once. I've searched for a long time and found some pretty complex solutions, but I think this can be simplified and reused...sad :'(
Statement
awk -F"\t" '!seen[$3]++'
File
r1c1 r1c2 r1c3
r2c1 r2c2 r2c3
r3c1 r3c2 r3c3
r4c1 r4c2 r3c3
r5c1 r5c2 r5c3
Desired Output
r3c1 r3c2 r3c3
r4c1 r4c2 r3c3
The one-line version of the code below prints an extra 0 and 1 in the output:
[user@host]$ awk '{a[$3]=a[$3] $0 RS c[$3]++} END {for (i in c) if (c[i]>1) printf "%s",a[i]}' file
r3c1 r3c2 r3c3
0r4c1 r4c2 r3c3
1[user@host]$
Upvotes: 3
Views: 2825
Reputation: 133590
The following awk version may also help if you want the output in the same order as the lines appear in Input_file. It reads the file twice:
awk 'FNR==NR{a[$3]++;next} a[$3]>1' Input_file Input_file
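A minimal sketch of the two-pass command on the sample data from the question, assuming whitespace-separated fields and a scratch file with the hypothetical name sample.txt. For contrast it also runs the `!seen[$3]++` idiom from the question, which keeps only the first line for each third-column value and therefore removes duplicates instead of selecting them.

```shell
# Recreate the question's sample data (sample.txt is a hypothetical file name).
printf '%s\n' \
  'r1c1 r1c2 r1c3' \
  'r2c1 r2c2 r2c3' \
  'r3c1 r3c2 r3c3' \
  'r4c1 r4c2 r3c3' \
  'r5c1 r5c2 r5c3' > sample.txt

# Pass 1 (FNR==NR is true only for the first file argument): count field 3.
# Pass 2: print each line whose field-3 value was counted more than once.
out=$(awk 'FNR==NR { count[$3]++; next } count[$3] > 1' sample.txt sample.txt)
printf '%s\n' "$out"

# The idiom from the question: true only the first time a key is seen,
# so it prints one line per distinct field-3 value.
first_only=$(awk '!seen[$3]++' sample.txt)
printf '%s\n' "$first_only"
```

Because the second pass walks the file in its original order, the matching lines come out in input order.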
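EDIT: A one-pass version that reads the file only once (the groups are printed in whatever order awk's for-in loop visits them):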
awk '{++a[$3];b[$3]=b[$3]?b[$3] ORS $0:$0}END{for(i in a){if(a[i]>1){print b[i]}}}' Input_file
Upvotes: 2
Reputation: 92854
Simply with the uniq command:
uniq -f2 -D file
-f N - avoid comparing the first N fields
-D - print all duplicate lines
The output:
r3c1 r3c2 r3c3
r4c1 r4c2 r3c3
If the file is not sorted on the third field:
sort -k3 file | uniq -f 2 -D
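A small sketch of the sort-then-uniq pipeline on shuffled input, assuming a uniq that supports `-D` (it is not part of POSIX; GNU coreutils provides it) and a scratch file with the hypothetical name shuffled.txt. It uses `-k3,3` so the sort key stops at the third field, where `-k3` would extend the key to the end of the line; for this three-column data both behave the same. `LC_ALL=C` is set only to make the ordering predictable.

```shell
# Sample data from the question in shuffled order (shuffled.txt is hypothetical).
printf '%s\n' \
  'r4c1 r4c2 r3c3' \
  'r1c1 r1c2 r1c3' \
  'r3c1 r3c2 r3c3' \
  'r5c1 r5c2 r5c3' \
  'r2c1 r2c2 r2c3' > shuffled.txt

# uniq only compares adjacent lines, so sort on field 3 first,
# then skip the first 2 fields and print every member of each duplicate group.
out=$(LC_ALL=C sort -k3,3 shuffled.txt | uniq -f 2 -D)
printf '%s\n' "$out"
```

Unlike the awk approaches, this returns lines in sorted order, not in their original input order.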
Upvotes: 2
Reputation: 37414
In awk, a one-pass version that stores the records in a hash:
$ awk '
{
    a[$3]=a[$3] $0 RS        # store records
    c[$3]++                  # counter
}
END {
    for(i in c)
        if(c[i]>1)           # pick the ones with duplicates
            printf "%s",a[i]
}' file
r3c1 r3c2 r3c3
r4c1 r4c2 r3c3
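If this script is collapsed onto a single line, the two statements in the first block need a `;` between them. Without it, awk parses `a[$3] $0 RS c[$3]++` as one concatenation, so the counter's previous value (0, then 1) is appended to the stored text, which matches the stray 0 and 1 shown in the question. A sketch of the one-line form, assuming the question's sample data in a scratch file with the hypothetical name sample.txt:

```shell
# Recreate the question's sample data (sample.txt is a hypothetical file name).
printf '%s\n' \
  'r1c1 r1c2 r1c3' \
  'r2c1 r2c2 r2c3' \
  'r3c1 r3c2 r3c3' \
  'r4c1 r4c2 r3c3' \
  'r5c1 r5c2 r5c3' > sample.txt

# The ";" after RS ends the assignment, so c[$3]++ runs as its own statement
# instead of being concatenated onto the stored record.
out=$(awk '{ a[$3] = a[$3] $0 RS; c[$3]++ } END { for (i in c) if (c[i] > 1) printf "%s", a[i] }' sample.txt)
printf '%s\n' "$out"
```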
Upvotes: 4