Swimming bird

Reputation: 77

Find repeated pairs comparing two files

I have two files:

File 1:

A B
C D
F A
C G

File 2:

A G
C D
A C
D C
F A

I want to find every pair of words in file 2 that also appears in file 1, remove those pairs from file 2, and then join the two files. In this case, the repeated pairs are:

C D
D C
F A

Note that the same pair in reverse order also counts as a repeat, so I don't want those either. Any word can appear multiple times in the two files.

I tried this but it's not efficient and requires an extra step to remove the repetitions from file 2:

cat file1 | while read f1 f2; do grep "$f1 $f2\|$f2 $f1" file2; done > redundancies.txt

grep -vf redundancies.txt file2 > file2b

Upvotes: 0

Views: 42

Answers (1)

John Kugelman

Reputation: 361927

$ grep -vFf f1 f2
A G
A C
D C

This reads file 2 and removes any lines that also appear in file 1 (add -x if a line should only be removed when it matches in full rather than as a substring). To handle the words appearing in either order, replace f1 with a process substitution that prints the file with both word orderings.

$ grep -vFf <(cat f1; awk '{print $2,$1}' f1) f2
A G
A C
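
The question also asks for the two files to be joined once the repeats are gone. A minimal sketch of one way to do both steps in a single awk pass, assuming every line holds two whitespace-separated words and file 1 is not empty. The temporary directory and the output name merged.txt are hypothetical, chosen only for illustration:

```shell
# Recreate the sample data from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'A B' 'C D' 'F A' 'C G' > "$dir/file1"
printf '%s\n' 'A G' 'C D' 'A C' 'D C' 'F A' > "$dir/file2"

# While reading file1 (NR == FNR): remember each pair in both orders, print it.
# While reading file2: print only the pairs that were not remembered.
awk 'NR == FNR { seen[$1, $2]; seen[$2, $1]; print; next }
     !(($1, $2) in seen)' "$dir/file1" "$dir/file2" > "$dir/merged.txt"

cat "$dir/merged.txt"
```

On the sample data this prints the four lines of file 1 followed by A G and A C. Because the pairs are looked up as array keys rather than as patterns, the comparison is on whole words, so a pair such as A B would not accidentally match a line like A BC.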

Upvotes: 1
