Aarav

Reputation: 59

Printing duplicate rows as many times as they are duplicated in the input file using UNIX

Suppose I have a sorted file:

 AARAV,12345,BANK OF AMERICA,$145
 AARAV,12345,BANK OF AMERICA,$145
 AARAV,12345,BANK OF AMERICA,$145
 RAM,124455,DUETCHE BANK,$240

And I want output as:

 AARAV,12345,BANK OF AMERICA,$145
 AARAV,12345,BANK OF AMERICA,$145

With **uniq -d file** I am able to find duplicate records, but it prints each record only once even if it is repeated. I want to print each record as many times as it is duplicated. Thanks in advance.

Upvotes: 1

Views: 210

Answers (2)

merlin2011

Reputation: 75639

The following should do what you want, assuming your file is called Input.txt.

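uniq -d Input.txt | xargs -I {} grep -F -- {} Input.txt

xargs -I {} tells xargs to run the command once per input line, substituting that line wherever {} appears in the command.

grep -F -- {} Input.txt is therefore called once for each line coming from the pipe, with that line in place of {}. The -F option makes grep treat the line as a fixed string rather than a regular expression, and -- stops a line that begins with a dash from being read as an option.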

Why does this work? We are using uniq -d to find the duplicate entries, and then using them as input patterns to grep to match all the lines which contain those entries. Thus, only duplicate entries are printed, and they are printed exactly as many times as they appear in the file.

Update: To print only the duplicate occurrences and not the first occurrence, in a way that is compatible with ksh (the OP apparently does not have bash on his system):

uniq -d Input.txt | while IFS= read -r line
do
    # -F treats the line as a fixed string, not a regular expression
    grep -F -- "$line" Input.txt | tail -n +2
done

Note that in the above scripts, we are assuming that no line is a substring of another line.
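The substring assumption noted above can probably be dropped by asking grep to match whole lines only. The following is a minimal sketch, not a tested replacement: print_extra_dups is a hypothetical helper name, and it assumes the input is sorted as in the question and that the system grep supports the POSIX -F and -x options.

```shell
# Hypothetical helper: for every duplicated line in a sorted file,
# print each occurrence except the first.
print_extra_dups() {
    file=$1
    uniq -d "$file" | while IFS= read -r line
    do
        # -F: fixed string, not a regular expression
        # -x: the pattern must match the entire line, not a substring
        grep -Fx -- "$line" "$file" | tail -n +2
    done
}
```

It would be called as print_extra_dups Input.txt. Because -x requires a whole-line match, a shorter line that happens to be contained in a longer one no longer pulls in the longer line.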

Upvotes: 1

Bradley Snyder

Reputation: 228

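This should give you the output that you want. It prints each duplicated line N-1 times, where N is the number of times the line appears in the input. Unfortunately awk does not iterate over the array in any particular order, so the output isn't sorted and you have to pipe it through sort again.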

Assuming the input file is input.txt:

awk -F '\n' '{ a[$1]++ } END { for (b in a) { while(--a[b]) { print b } } }' input.txt | sort
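If keeping the input order matters, a common awk idiom may avoid the extra sort. This is only a sketch under the same N-1 interpretation: print_repeats is a hypothetical helper name and seen is an arbitrary array name. The expression seen[$0]++ evaluates to the number of times the line has already been seen, so it is zero (false) on the first occurrence and non-zero (true) on every later one, and awk prints the line whenever the expression is true.

```shell
# Hypothetical helper: print every occurrence of a line after its first,
# in the order the lines appear in the input.
print_repeats() {
    # seen[$0]++ yields the count before incrementing: 0 the first time
    # a line is read, so only later occurrences are printed.
    awk 'seen[$0]++' "$1"
}
```

It would be called as print_repeats input.txt. Since lines are printed as they are read, the input does not need to be sorted for this version.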

Upvotes: 0
