Reputation: 69
I have two files, for example:
file 1:
1 azer 4
2 toto 0
3 blabla 8
4 riri 9
5 coco 2
file 2:
1 azer 4
2 toto 0
3 blabla 8
I want to compare the two files and remove from file 1 every line that also appears in file 2. For example:
Output:
4 riri 9
5 coco 2
I tried this command, but it only shows me the lines the files have in common:
awk 'NR==FNR{a[$2];next} $1 in a {print $0}' merge genotype.txt
Does anyone know how to do this? I tried it in awk, but a solution in R or Python would be fine too.
Upvotes: 1
Views: 1107
Reputation: 1770
A much simpler solution using grep:
$ cat file1
1 azer 4
2 toto 0
3 blabla 8
4 riri 9
5 coco 2
$ cat file2
1 azer 4
2 toto 0
3 blabla 8
Try:
grep -vf file2 file1
Output:
4 riri 9
5 coco 2
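One caveat: with plain -f, grep treats each line of file2 as a pattern that may match anywhere inside a line of file1, so a pattern such as "1 azer 4" would also remove a line like "11 azer 42". The Python sketch below illustrates the difference between substring and whole-line matching. The function names and the extra sample line are made up for illustration, and the substring check only approximates grep for patterns that contain no regex metacharacters.

```python
def remove_substring_matches(lines, patterns):
    # Roughly what `grep -vf` does for metacharacter-free patterns:
    # drop a line if any pattern occurs anywhere inside it.
    return [ln for ln in lines if not any(p in ln for p in patterns)]


def remove_exact_matches(lines, patterns):
    # Roughly what `grep -Fvxf` does: drop a line only if it equals a pattern.
    wanted = set(patterns)
    return [ln for ln in lines if ln not in wanted]


file1 = ["1 azer 4", "11 azer 42", "4 riri 9"]  # "11 azer 42" is a made-up line
file2 = ["1 azer 4"]

print(remove_substring_matches(file1, file2))  # ['4 riri 9']
print(remove_exact_matches(file1, file2))      # ['11 azer 42', '4 riri 9']
```

If whole lines must match exactly, adding -F (fixed strings) and -x (whole line) to the grep command avoids this.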
Upvotes: 2
Reputation: 16997
# awk
awk 'FNR==NR{a[$0];next}!($0 in a)' file2 file1
# comm (both files must be sorted)
comm -23 file1 file2
# grep
grep -Fvxf file2 file1
Input
$ cat file1
1 azer 4
2 toto 0
3 blabla 8
4 riri 9
5 coco 2
$ cat file2
1 azer 4
2 toto 0
3 blabla 8
Output
$ awk 'FNR==NR{a[$0];next}!($0 in a)' file2 file1
4 riri 9
5 coco 2
$ comm -23 file1 file2
4 riri 9
5 coco 2
$ grep -Fvxf file2 file1
4 riri 9
5 coco 2
Upvotes: 1
Reputation: 140168
First, read the lines of file 2 into a set so that membership tests are fast. Then iterate over the lines of file 1 and write to the output file only those that are not in the set, using a generator expression.
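# Load file 2 into a set for fast membership tests.
with open("file2.txt") as f:
    file2 = set(f)
# Write to file3.txt only the lines of file 1 that are absent from file 2.
with open("file1.txt") as fr, open("file3.txt", "w") as fw:
    fw.writelines(l for l in fr if l not in file2)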
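One thing to watch for: the lines are compared together with their trailing newline, so a final line that lacks a newline in one file will not match the same text followed by a newline in the other. Here is a minimal sketch that strips line endings before comparing. The function name `remove_common_lines` and the file paths are placeholders, not part of the original question.

```python
def remove_common_lines(path1, path2, out_path):
    # Collect the lines of the second file without their line endings.
    with open(path2) as f:
        to_remove = {line.rstrip("\n") for line in f}

    kept = []
    with open(path1) as fr, open(out_path, "w") as fw:
        for line in fr:
            text = line.rstrip("\n")
            # Keep only lines that do not appear anywhere in the second file.
            if text not in to_remove:
                kept.append(text)
                fw.write(text + "\n")
    return kept
```

It could be called as `remove_common_lines("file1.txt", "file2.txt", "file3.txt")`. It also returns the kept lines as a list, which makes the result easy to inspect.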
Upvotes: 1