Reputation: 3382
I have a question about printing rows that have the same 4th column but a different 1st column.
INPUT:
156817 GJB2 HET 882745
156817 ASPA HET 882745
156817 HFE HET 882745
156917 ABCA4 HET 882745
156917 MEFV HET 882745
156917 HFE HET 882745
228417 GJB2 HET 883590
228417 BTD HET 883590
228417 MCCC1 HET 883590
OUTPUT:
156817 HFE HET 882745 156917 HFE HET 882745
To clarify: I want to find rows whose 1st columns differ but whose 4th and 2nd columns are the same, and print them on one row. In this example, two rows share the same 4th column (882745) and the same 2nd column (HFE) but have different 1st columns (156817 and 156917). This is really hard for me. I have tried many approaches, but I can't get the result. Thank you.
What I tried:
awk -F'\t' -v OFS="\t" '{prev=$0; f1=$2; f2=$4; f3=$1
getline
if ($2 == f1 && $4 == f2 && $1!= f3 ) {
print prev
print }
}' file
But it doesn't work.
Upvotes: 2
Views: 34
Reputation: 92894
awk solution:
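awk -F'\t' '{ k = $2 SUBSEP $3 SUBSEP $4 }   # key: columns 2, 3 and 4
# Same key seen earlier with a different 1st column: print both rows on one line
{ if ((k in a) && $1 != a[k]) { printf "%s\t%s\t%s\t%s\t%s ", a[k], $2, $3, $4, $0 }
  else a[k] = $1 }                           # otherwise remember the 1st column for this key
END { print "" }' file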
The output:
156817 HFE HET 882745 156917 HFE HET 882745
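For reference, here is a self-contained variant that can be run without a separate input file. It is only a sketch under a few assumptions: the data is tab-separated (as the `-F'\t'` above implies), the key is built from the 2nd and 4th columns only (the question's wording, whereas the command above also includes the 3rd), and each matching pair is printed on its own line. The names `input`, `result`, `first` and `id` are made up for this illustration.

```shell
# Sample data from the question, written tab-separated to a temp file
# (assumption: the real file uses tabs between columns).
input=$(mktemp)
printf '%s\t%s\t%s\t%s\n' \
  156817 GJB2 HET 882745 \
  156817 ASPA HET 882745 \
  156817 HFE HET 882745 \
  156917 ABCA4 HET 882745 \
  156917 MEFV HET 882745 \
  156917 HFE HET 882745 \
  228417 GJB2 HET 883590 \
  228417 BTD HET 883590 \
  228417 MCCC1 HET 883590 > "$input"

result=$(awk -F'\t' -v OFS='\t' '
  { key = $2 SUBSEP $4 }                      # group on the 2nd and 4th columns
  # Key already seen with a different 1st column: print the stored row and this one
  (key in first) && $1 != id[key] { print first[key], $0; next }
  # First time this key appears: remember the whole row and its 1st column
  !(key in first) { first[key] = $0; id[key] = $1 }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

On the sample data this prints the two HFE rows joined by a tab on a single line. If a key occurs with three or more different 1st-column values, each later row is paired with the first one seen, which may or may not be what you want.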
Upvotes: 3