Reputation: 553
I need to print all the duplicated lines in a file on a system where uniq does not support the -D option, so I am looking for an alternative way to print the duplicate lines using awk. I know awk can do something like the following.
testfile.txt
apple
apple
orange
orange
cherry
cherry
kiwi
strawberry
strawberry
papaya
cashew
cashew
pista
The command:
awk 'seen[$0]++' testfile.txt
But this prints only one copy of each duplicated line. I need the same output that uniq -D produces, like this:
apple
apple
orange
orange
cherry
cherry
strawberry
strawberry
cashew
cashew
Upvotes: 11
Views: 19423
Reputation: 58528
This might work for you (GNU sed):
sed -rn ':a;N;/^([^\n]*)\n\1$/p;//ba;/^([^\n]*)(\n\1)+$/P;//ba;s/.*\n//;ba' file
Read two lines into the pattern space (PS). If the two lines are identical, print both, then loop back and append a third line. If every line now in the PS is identical, print only the first one and loop back to read another line. Otherwise, remove all but the last line and loop back to read the next one.
Upvotes: 1
Reputation: 52451
With sed:
$ sed 'N;/^\(.*\)\n\1$/p;$d;D' testfile.txt
apple
apple
orange
orange
cherry
cherry
strawberry
strawberry
cashew
cashew
This does the following:
N # Append next line to pattern space
/^\(.*\)\n\1$/p # Print if lines in pattern space are identical
$d # Avoid printing lone non-duplicate last line
D # Delete first line in pattern space
There are a few limitations:
It only works for contiguous duplicates, i.e., not for
apple
orange
apple
Lines appearing more than twice in a row throw it off.
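For runs longer than two lines, one possible workaround is to track the previous line in awk instead of using sed. This is a different technique from the sed command above, offered only as a minimal sketch; it still assumes duplicates are adjacent, as uniq -D does, and `adjacent_dups` is a hypothetical helper name.

```shell
#!/bin/sh
# adjacent_dups is a hypothetical helper name, not part of any tool.
adjacent_dups() {
    awk '
        # ($0 "") forces a string comparison, so "1" and "1.0" stay distinct.
        NR > 1 && ($0 "") == prev {
            if (!shown) print prev   # first repeat: emit the held-back copy
            print                    # emit the current copy
            shown = 1
            next
        }
        { prev = $0; shown = 0 }     # a new run starts here
    ' "$@"
}

# Three adjacent apples are all printed; the later lone apple is not.
run_out=$(printf '%s\n' apple apple apple orange apple | adjacent_dups)
printf '%s\n' "$run_out"
```

Because only the previous line is remembered, memory use stays constant regardless of file size.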
Upvotes: 1
Reputation: 204446
No need to parse the file twice:
$ awk 'c[$0]++; c[$0]==2' file
apple
apple
orange
orange
cherry
cherry
strawberry
strawberry
cashew
cashew
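The one-liner can be hard to read at first, so here is a more verbose sketch of the same single-pass idea. The function name `print_dups` is made up for illustration. Note that the first copy of a line is emitted only when its second occurrence is reached, so with non-adjacent duplicates the output order can differ from the input order.

```shell
#!/bin/sh
# print_dups is a hypothetical helper name used only for this example.
print_dups() {
    awk '
        {
            n = ++count[$0]      # occurrences of this line so far
            if (n == 2) print    # second sighting: emit the first copy that was held back
            if (n >= 2) print    # emit the current copy
        }
    ' "$@"
}

# Works for non-adjacent duplicates too: all three apples are printed.
single_out=$(printf '%s\n' apple orange apple kiwi apple | print_dups)
printf '%s\n' "$single_out"
```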
Upvotes: 17
Reputation: 247092
If you want to stick with just plain awk, you'll have to process the file twice: once to generate the counts, and once to eliminate the lines with a count equal to 1:
awk 'NR==FNR {count[$0]++; next} count[$0]>1' testfile.txt testfile.txt
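As a small illustration of how the two passes behave, the sketch below runs the same command on a throwaway file with non-adjacent duplicates. The temporary file and the variable name `two_pass` are assumptions made for the demo. Since printing happens only during the second read, duplicated lines come out in their original file order.

```shell
#!/bin/sh
# Throwaway input file for the demo (hypothetical data).
tmp=$(mktemp)
printf '%s\n' apple kiwi orange orange apple > "$tmp"

# Pass 1: NR==FNR holds only while reading the first file argument, so count lines.
# Pass 2: print every line whose count exceeds 1.
two_pass=$(awk 'NR==FNR { count[$0]++; next } count[$0] > 1' "$tmp" "$tmp")

rm -f "$tmp"
printf '%s\n' "$two_pass"
```

The file has to be a regular file (not a pipe), because awk needs to read it twice.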
Upvotes: 7
Reputation: 8406
awk '{if (x[$1]) { x_count[$1]++; print $0; if (x_count[$1] == 1) { print x[$1] } } x[$1] = $0}' testfile.txt
Upvotes: 0
Reputation: 10209
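Something like this, if uniq supports -d? The -F and -x options make grep treat each pattern as a literal whole line rather than a regular expression that can match substrings (this needs a shell with process substitution, such as bash):
grep -Fxf <(uniq -d testfile.txt) testfile.txt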
Upvotes: 0