Tedee12345

Reputation: 1362

AWK - print only duplicates

I have a file:

jeden
dwa
jeden
trzy
trzy
cztery
piec
jeden

This command prints out:

$ awk 'BEGIN {while ((getline < "file") > 0) if(a[$0]++) print }'
jeden
trzy
jeden

I want to print every occurrence of each duplicated line, including the first one:

jeden
jeden
trzy
trzy
jeden

EDIT:

I found an example that works.

awk '{if (x[$1]) { x_count[$1]++; print $0; if (x_count[$1] == 1) { print x[$1] } } x[$1] = $0}' file

I want to do the same, but with getline.

Upvotes: 2

Views: 5600

Answers (3)

Dennis Williamson

Reputation: 360485

awk 'BEGIN {while ((getline < "file") > 0) { a[$0]++; if(a[$0] == 2) print; if (a[$0] >= 2) print }}'

When the count reaches two, both conditions are true, so the line is printed twice: once to "catch up" on the first occurrence and once for the second. For every later occurrence only the `>= 2` test is true, so the line is printed once.

Upvotes: 3

Kevin

Reputation: 56129

You'll need to either store all lines in memory or take a second pass through the file. It's probably easier to do the first, and unless it's a massive file, you probably have the memory for it. You can stuff this onto one line, of course, but for ease of understanding here it is as a file.

#!/usr/bin/awk -f

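{
        lines[NR] = $0   # remember every line, keyed by its 1-based line number
        counts[$0]++     # count how often each line occurs
}

END {
        # NR still holds the total number of lines here; indices run from 1 to NR
        for(i = 1; i <= NR; i++) {
                if(counts[lines[i]] > 1) {
                        print lines[i]
                }
        }
}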

Also, your original would be more concisely written as this:

$ awk 'a[$0]++' file
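
The second-pass alternative mentioned at the top of this answer can be sketched with getline, since that is what the question asks for. This is only a minimal sketch: the temporary file and the names `tmp`, `f`, `line`, `counts` and `out` are assumptions made for illustration, and the sample data is copied from the question.

```shell
# Sample input from the question, written to a temporary file (hypothetical path).
tmp=$(mktemp)
printf '%s\n' jeden dwa jeden trzy trzy cztery piec jeden > "$tmp"

# Pass 1 counts each line. close() lets the next getline start again
# from the top of the file. Pass 2 prints every line seen more than once.
out=$(awk -v f="$tmp" 'BEGIN {
    while ((getline line < f) > 0) counts[line]++
    close(f)
    while ((getline line < f) > 0)
        if (counts[line] > 1) print line
}')

printf '%s\n' "$out"
rm -f "$tmp"
```

This keeps only the counts in memory rather than every line, at the cost of reading the file twice.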

Upvotes: 1

potong

Reputation: 58498

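This might work for you if the output order does not matter (duplicates are printed grouped together, and `for(x in a)` visits keys in an unspecified order):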

awk '{a[$1]++}END{for(x in a)if(a[x]>1)for(i=1;i<=a[x];i++)print x}' file

Upvotes: 0
