Vignesh Jei

Reputation: 23

Compare 2 files and remove duplicate lines only once

I need to remove from file1 the lines that also appear in file2. The problem is that the values c and g from file2 also appear under [b] in file1, so those lines are being deleted as well. My requirement is to delete only the ones under [a]. Thanks

$ less file1
[a]
c
g
d
[b]
c
g
h

and

$ less file2
[a]
c
g
d

Upvotes: 0

Views: 784

Answers (1)

anubhava

Reputation: 784958

You can use this awk command:

awk '/^\[.*\]/{s=$0} FNR==NR{seen[s,$0]++; next} !seen[s,$0]' file2 file1
[b]
c
g
h

This awk command uses an associative array, seen, with the composite key s,$0: the most recent [...] header line (s) together with the current record ($0).

While reading file2 it stores each such key in the array. While reading file1 it prints only the lines whose key is not in seen, so a line is dropped only when it appears under the same header in file2.
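To make the two passes easier to follow, here is the same idea as a commented script. This is only a sketch: it recreates the sample data from the question in a scratch directory, and the function name dedupe_by_section is a hypothetical helper introduced here for illustration. It tests membership with `in` instead of a counter, which should behave the same for this input.

```shell
#!/bin/sh
# Scratch directory so that no existing file1/file2 is overwritten.
tmpdir=$(mktemp -d)

# Recreate the sample input from the question.
printf '%s\n' '[a]' c g d '[b]' c g h > "$tmpdir/file1"
printf '%s\n' '[a]' c g d > "$tmpdir/file2"

# dedupe_by_section REFERENCE TARGET   (hypothetical helper name)
# Prints the lines of TARGET that do not occur under the same
# [section] header in REFERENCE.
dedupe_by_section() {
    awk '
        /^\[.*\]/ { s = $0 }                  # remember the current section header
        FNR == NR { seen[s, $0] = 1; next }   # first file: record (section, line) pairs
        !((s, $0) in seen)                    # second file: print pairs not recorded
    ' "$1" "$2"
}

result=$(dedupe_by_section "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"
```

Because the header line is itself stored under its own section, the [a] line of file1 is dropped along with its entries, which matches the output shown above. Note that the FNR==NR test assumes the first file is not empty.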

Upvotes: 3
