Reputation: 177
I have a file with multiple columns. I want to compare the first A1 column ($4) with A2 ($14). If the values do not match, I want to print the value of A2 ($14); if they match, I want to print the value of the second A1 column ($15). The result should be appended as a new column named noneff.
File:
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T
Desired output:
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1 noneff
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T C
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T G
I first checked the differences between columns 4 and 15:
awk '$4!=$15{print $4,$15}' file > diff
Then I tried to write the if-else statement:
awk '{if($4=$14) print $16=$14 ; else print $16=$15}' file > new_file
Upvotes: 1
Views: 954
Reputation: 2091
Try this:
awk 'NR==1{$(++NF)="noneff"}NR>1{$(++NF)=($4==$14)?$15:$14}1' so1186.txt
Output:
awk 'NR==1{$(++NF)="noneff"}NR>1{$(++NF)=($4==$14)?$15:$14}1' so1186.txt | column -t
chr  SNP          BP     A1  TEST  N       OR      Z       P       chr  SNP          cm  BP     A2  A1  noneff
20   rs6078030    61098  T   ADD   421838  0.9945  -0.209  0.8344  20   rs6078030    0   61098  C   T   C
20   rs143291093  61270  G   ADD   422879  1.046   0.5966  0.5508  20   rs143291093  0   61270  G   A   A
20   rs4814683    61795  T   ADD   417687  1.015   0.6357  0.525   20   rs4814683    0   61795  G   T   G
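If the one-liner is hard to read, the same logic can be spelled out as a small shell sketch. The function name add_noneff and the temporary sample file are made up for illustration only. One thing worth noting about the attempt in the question: in awk, `$4=$14` is an assignment, not a comparison, so it overwrites $4 and the condition is true whenever $14 is non-empty and non-zero. The comparison operator is `==`.

```shell
# Hypothetical helper: append a "noneff" column to a whitespace-delimited file.
add_noneff() {
    awk '
        # Header line: add the new column name.
        NR == 1 { print $0, "noneff"; next }
        # Data lines: if $4 equals $14 take $15, otherwise take $14.
        { print $0, ($4 == $14 ? $15 : $14) }
    ' "$1"
}

# Sample data copied from the question, written to a temporary file.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T
EOF

add_noneff "$sample"
```

Unlike the one-liner, this version prints `$0` unchanged and appends the new value rather than rebuilding the record through `NF`, so the original spacing of each line is preserved. Pipe the result through `column -t` if you want aligned columns.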
Upvotes: 3