Erika

Reputation: 69

Compare 2 files and extract lines that are different

I have two files, for example:

file 1:

1 azer 4
2 toto 0
3 blabla 8
4 riri 9
5 coco 2

file 2:

1 azer 4
2 toto 0
3 blabla 8

I want to compare the two files and remove from file 1 every line that also appears in file 2. For example:

Output:

4 riri 9
5 coco 2

I tried this command, but it shows me only the lines the two files have in common:

awk 'NR==FNR{a[$2];next} $1 in a {print $0}' merge genotype.txt

Does anyone know how to do this? I tried it in awk, but a solution in R or Python would be fine too.

Upvotes: 1

Views: 1107

Answers (3)

Chem-man17

Reputation: 1770

A much simpler solution using grep:

$ cat file1
1 azer 4
2 toto 0
3 blabla 8
4 riri 9
5 coco 2

$ cat file2
1 azer 4
2 toto 0
3 blabla 8

Try:

grep -vf file2 file1

Output:

4 riri 9
5 coco 2
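
One caveat may be worth illustrating: without `-F` and `-x`, grep treats each line of file2 as a pattern that can match anywhere inside a line of file1. The Python sketch below only approximates the two behaviours (Python's `re` is not the same regex dialect as grep's, and the sample line `11 azer 42` is made up for illustration), but it shows how a longer line could be dropped by accident.

```python
import re

# Hypothetical sample data; "11 azer 42" is not in the original files
patterns = ["1 azer 4"]
lines = ["1 azer 4", "11 azer 42", "4 riri 9"]

# Roughly `grep -vf`: drop a line if any pattern matches anywhere in it
substring_filtered = [
    line for line in lines
    if not any(re.search(p, line) for p in patterns)
]

# Roughly `grep -Fvxf`: drop a line only if it equals a pattern exactly
exact = set(patterns)
exact_filtered = [line for line in lines if line not in exact]

print(substring_filtered)
print(exact_filtered)
```

With the question's sample files both approaches give the same result, since no line of file2 occurs inside a longer line of file1.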

Upvotes: 2

Akshay Hegde

Reputation: 16997

# awk
awk 'FNR==NR{a[$0];next}!($0 in a)' file2 file1

# comm (requires both files to be sorted)
comm -23 file1 file2

# grep (-F: fixed strings, -x: match whole lines only)
grep -Fvxf file2 file1

Input

$ cat file1
1 azer 4
2 toto 0
3 blabla 8
4 riri 9
5 coco 2

$ cat file2
1 azer 4
2 toto 0
3 blabla 8

Output

$ awk 'FNR==NR{a[$0];next}!($0 in a)' file2 file1
4 riri 9
5 coco 2

$ comm -23 file1 file2
4 riri 9
5 coco 2

$ grep -Fvxf file2 file1
4 riri 9
5 coco 2
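
For readers more at home in Python, the awk one-liner can be read as: remember every line of file2, then print each line of file1 that was not remembered. A minimal sketch of that logic follows; the function name `lines_not_in` is hypothetical, and it works on lists of strings rather than real files.

```python
def lines_not_in(file1_lines, file2_lines):
    """Return the lines of file1 that do not appear in file2, in order."""
    seen = set(file2_lines)  # plays the role of the awk array `a`
    return [line for line in file1_lines if line not in seen]


file1 = ["1 azer 4", "2 toto 0", "3 blabla 8", "4 riri 9", "5 coco 2"]
file2 = ["1 azer 4", "2 toto 0", "3 blabla 8"]

for line in lines_not_in(file1, file2):
    print(line)
```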

Upvotes: 1

Jean-François Fabre

Reputation: 140168

First, read the lines of file 2 into a set so that membership tests are fast. Then iterate over the lines of file 1 and write those not in the set to the output file, using a generator expression.

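# Load the lines of file 2 into a set for fast membership tests
with open("file2.txt") as f:
    file2 = set(f)

# Stream file 1 and write only the lines that are absent from file 2
with open("file1.txt") as fr, open("file3.txt", "w") as fw:
    fw.writelines(line for line in fr if line not in file2)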
  • order is preserved
  • membership testing is fast (set lookup)
  • file 1 is never read fully into memory; the chain of iterators reads and writes the files line by line
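
One edge case to keep in mind: the set holds raw lines including their line endings, so a last line without a trailing newline in one file would not match the same text followed by a newline in the other. The sketch below is one possible way to handle that, by comparing lines with the newline stripped; the function name `filter_lines` and the use of `io.StringIO` in place of real files are assumptions made for illustration.

```python
import io


def filter_lines(src, exclude, dst):
    """Copy lines from src to dst, skipping any line also present in exclude."""
    # Compare without line endings so a missing final newline does not matter
    excluded = {line.rstrip("\n") for line in exclude}
    for line in src:
        if line.rstrip("\n") not in excluded:
            dst.write(line)


file1 = io.StringIO("1 azer 4\n2 toto 0\n3 blabla 8\n4 riri 9\n5 coco 2\n")
file2 = io.StringIO("1 azer 4\n2 toto 0\n3 blabla 8")  # no trailing newline
out = io.StringIO()

filter_lines(file1, file2, out)
print(out.getvalue(), end="")
```

Any file-like objects opened in text mode should work in place of the `StringIO` instances.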

Upvotes: 1
