user10676353

Reputation: 23

Find duplicate records in a CSV with a shell script (Ubuntu)

I have the CSV below:

name,mobile
name1,123456
name2,98765
name1,123456
name3,98765
name1,123456
name4,344545443

If two records have the same mobile number, they are considered duplicates. When printing the duplicate records, however, the first occurrence has to be ignored.

So my output should be like this

name,mobile
name1,123456
name1,123456
name2,98765

Here 123456 appears three times in my file, but I only want to print it twice: for me, the first occurrence is unique and all later occurrences are duplicates.

I have tried

awk -F, 'NR==FNR {++A[$2]; next} A[$2]>1'  file1.csv file1.csv

It gives me

name1,123456
name2,98765
name1,123456
name3,98765
name1,123456

It's not ignoring the first occurrence.

Please help me with this.

Upvotes: 2

Views: 272

Answers (1)

glenn jackman

Reputation: 247202

As I understand your question, you want to output records where the 2nd field occurs at least twice, but do not output the first instance.

awk -F, '++seen[$2] > 1' file

Given your sample data, this prints

name1,123456
name3,98765
name1,123456

These are lines 4, 5, and 6 of the input data.
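The expected output in the question also includes the header line, which the one-liner drops because `mobile` occurs only once in the second field. A minimal sketch that keeps it, assuming the header is always the first line of the file (the temporary file and the `out` variable are only there to make the example self-contained):

```shell
# Recreate the sample data from the question in a temporary file.
csv=$(mktemp)
cat > "$csv" <<'EOF'
name,mobile
name1,123456
name2,98765
name1,123456
name3,98765
name1,123456
name4,344545443
EOF

# NR == 1         : print the header unchanged and move on to the next line.
# ++seen[$2] > 1  : count each mobile number as it is read; the pattern is
#                   true only from the second occurrence on, so the first
#                   occurrence of every number is never printed.
out=$(awk -F, 'NR == 1 { print; next } ++seen[$2] > 1' "$csv")
printf '%s\n' "$out"

rm -f "$csv"
```

Because the counter is incremented while the file is being read, a single pass is enough. The two-pass attempt in the question counts everything first, so by the second pass every occurrence of a repeated number, including the first, already has a count above 1.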

Upvotes: 3
