kllrdr

Reputation: 177

Comparing two columns: if they match, print the value in a new column and if they do not match print the value of the second column to the new column

I have a file with multiple columns. I want to compare the first A1 column ($4) with A2 ($14) and write the result to a new column: if the values do not match, print the value of A2 ($14); if they match, print the value of the second A1 column ($15).

File:

chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T

Desired output:

chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1 noneff
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T C
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T G

First I checked which rows differ between columns 4 and 15:

awk '$4!=$15{print $4,$15}' file > diff

Then I tried to write the if-else statement:

awk '{if($4=$14) print $16=$14 ; else print $16=$15}' file > new_file

Upvotes: 1

Views: 954

Answers (2)

Joaquin

Reputation: 2091

Try this:

awk 'NR==1{$(++NF)="noneff"}NR>1{$(++NF)=($4==$14)?$15:$14}1' so1186.txt

Output:

awk 'NR==1{$(++NF)="noneff"}NR>1{$(++NF)=($4==$14)?$15:$14}1' so1186.txt | column -t
chr  SNP          BP     A1  TEST  N       OR      Z       P       chr  SNP          cm  BP     A2  A1  noneff
20   rs6078030    61098  T   ADD   421838  0.9945  -0.209  0.8344  20   rs6078030    0   61098  C   T   C
20   rs143291093  61270  G   ADD   422879  1.046   0.5966  0.5508  20   rs143291093  0   61270  G   A   A
20   rs4814683    61795  T   ADD   417687  1.015   0.6357  0.525   20   rs4814683    0   61795  G   T   G
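
For readability, the same logic can be spelled out with an explicit if/else instead of the ternary. This is only a sketch: the file names sample.txt and with_noneff.txt are made up for illustration, and the rows are the three from the question. Note the `==` for comparison; a single `=` would assign $14 to $4 instead of comparing them.

```shell
# Hypothetical sample file; the rows are copied from the question.
cat > sample.txt <<'EOF'
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T
EOF

awk '
NR == 1 { print $0, "noneff"; next }   # header line: append the new column name
{
    if ($4 == $14)                     # == compares; = would assign
        noneff = $15                   # first A1 equals A2: take the second A1
    else
        noneff = $14                   # otherwise take A2
    print $0, noneff                   # original line plus the new 16th field
}' sample.txt > with_noneff.txt
```

Piping with_noneff.txt through column -t aligns the columns the same way as in the output above.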

Upvotes: 3

vgersh99

Reputation: 965

awk '{$(++NF)=($4==$15)?$4:$15}1' file

Upvotes: 2
