Reputation: 93
File 1:
chr pos1 sample Gene
chr1 123 sample1 x
chr1 234 sample2 Y
chr2 345 sample2 z
File 2:
sample Gene chr pos1
sample1 x chr1 123
sample2 A chr1 234
sample2 c chr3 123
sample2 z chr2 345
I used awk 'NR==FNR{A[$1,$2]++;next}A[$3,$4]' file1 file2
to get the common rows. Likewise, I tried
awk 'NR==FNR{A[FNR]=[$1,$2]++;next}{print A[$3,$4]==A[FNR] ? $0"\t"1 :$0"\t"0}' file1 file2
but I am getting an error.
Upvotes: 0
Views: 248
Reputation: 26667
`print A[$3,$4]==A[FNR]` is wrong: in the first action you used `FNR` as the index, but here you are using `$3,$4`.
Also, in `A[FNR]=[$1,$2]++;`, I don't understand why you use `++` here.
You should be using something like
awk 'NR==FNR{A[FNR]=$0;}NR!=FNR{split(A[FNR],line); if (line[1] == $3 && line[2]==$4) print $0, 1; else print $0, 0}' file1 file2
which will give the output
sample Gene chr pos1 1
sample1 x chr1 123 1
sample2 A chr1 234 1
sample2 c chr3 123 0
sample2 z chr2 345 0
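Here, for the first file (`NR==FNR`), the entire line is copied into the array `A`, indexed by line number. For the second file (`NR!=FNR`), the stored line with the same line number is split into fields with `split`, and its first two fields are checked against `$3` and `$4`.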
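Note that the command above compares the two files line by line, so a row is only flagged when the matching row sits at the same line number in both files. If the rows may be in a different order, a key-based lookup (as in the first command in the question) might be closer to what is wanted. The following is only a sketch under that assumption; the temporary directory and the names `seen` and `flagged` are made up for illustration.

```shell
# Recreate the two sample files from the question in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
chr pos1 sample Gene
chr1 123 sample1 x
chr1 234 sample2 Y
chr2 345 sample2 z
EOF
cat > "$tmpdir/file2" <<'EOF'
sample Gene chr pos1
sample1 x chr1 123
sample2 A chr1 234
sample2 c chr3 123
sample2 z chr2 345
EOF

# First file (NR==FNR): remember every (chr, pos1) pair.
# Second file: append 1 if its (chr, pos1) pair was seen in file1, else 0.
flagged=$(awk '
    NR == FNR { seen[$1, $2]; next }
    { print $0, ((($3, $4) in seen) ? 1 : 0) }
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$flagged"
rm -rf "$tmpdir"
```

With this variant the last row (`sample2 z chr2 345`) is flagged 1, because `chr2 345` occurs somewhere in file1, whereas the line-by-line version flags it 0. Which behaviour is right depends on whether line position is supposed to matter.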
Upvotes: 1