Reputation: 392
I want to get the value of the 11th column in my tab-delimited file. This value consists of multiple values concatenated using : as the separator.
Example result from cut -f11 myFile:
.:7:.:2:100:.
I now want to split this value on the : separator and retrieve the second value.
This can be done with cut -d':' -f2.
My question: How can I write a command that returns all lines in my file that have a value of 5 or more in the second part of the 11th column?
input file (2 lines):
chr1 4396745 bnd_549 a a[chr9:136249370[ 100 PASS SVTYPE=BND;MATEID=bnd_550;EVENT=transl_inter_1022;GENE=; GT:AD:DP:SS:SSC:BQ .:.:.:.:.:. .:7:.:2:100:.
chr1 6315381 bnd_551 c ]chr9:68720182]c 100 PASS SVTYPE=BND;MATEID=bnd_552;EVENT=transl_inter_9346;GENE=; GT:AD:DP:SS:SSC:BQ .:.:.:.:.:. .:3:.:2:100:.
expected output:
chr1 4396745 bnd_549 a a[chr9:136249370[ 100 PASS SVTYPE=BND;MATEID=bnd_550;EVENT=transl_inter_1022;GENE=; GT:AD:DP:SS:SSC:BQ .:.:.:.:.:. .:7:.:2:100:.
What I tried (awk -F: '$11>=5' example.sorted.vcf) produces no output.
Upvotes: 2
Views: 584
Reputation: 247042
You could also use whitespace or colon as the field separator:
awk -F ':|[[:blank:]]+' '$23 >= 5' filename
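The index 23 is not fixed: it depends on how many colons appear before the last column (here the ALT value a[chr9:136249370[ contributes one). A minimal sketch for checking where the value lands, assuming a hypothetical tab-delimited copy of the question's first row and using [ \t] as a portable stand-in for [[:blank:]]:

```shell
# Hypothetical sample row: the first input line from the question, tab-delimited.
row=$(printf 'chr1\t4396745\tbnd_549\ta\ta[chr9:136249370[\t100\tPASS\tSVTYPE=BND;MATEID=bnd_550;EVENT=transl_inter_1022;GENE=;\tGT:AD:DP:SS:SSC:BQ\t.:.:.:.:.:.\t.:7:.:2:100:.\n')

# Split on ":" or runs of spaces/tabs, then print the field count and field 23.
info=$(printf '%s\n' "$row" | awk -F ':|[ \t]+' '{ print NF, $23 }')
printf '%s\n' "$info"   # prints "27 7" for this row
```

If another row had a different number of colons in an earlier column, the value of interest would shift to a different field number, so this approach is probably best reserved for files with a uniform layout.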
Upvotes: 0
Reputation: 77145
This should work (though untested, please provide input and expected output):
awk '{split($11,ary,/:/); if(ary[2]>=5) print}' myFile
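A slightly more defensive variant of the same idea, as a sketch rather than a definitive solution: it splits columns on tabs only and skips non-numeric placeholders such as "." before comparing. The temporary file and the variable names (sample, result) are assumptions for illustration; the two rows are the ones from the question.

```shell
# Hypothetical sample file reproducing the two rows from the question (tab-delimited).
sample=$(mktemp)
printf 'chr1\t4396745\tbnd_549\ta\ta[chr9:136249370[\t100\tPASS\tSVTYPE=BND;MATEID=bnd_550;EVENT=transl_inter_1022;GENE=;\tGT:AD:DP:SS:SSC:BQ\t.:.:.:.:.:.\t.:7:.:2:100:.\n' > "$sample"
printf 'chr1\t6315381\tbnd_551\tc\t]chr9:68720182]c\t100\tPASS\tSVTYPE=BND;MATEID=bnd_552;EVENT=transl_inter_9346;GENE=;\tGT:AD:DP:SS:SSC:BQ\t.:.:.:.:.:.\t.:3:.:2:100:.\n' >> "$sample"

# Split each line on tabs, split column 11 on ":", and print the line only if
# the second piece is an unsigned integer that is 5 or more.
result=$(awk -F'\t' '
  { n = split($11, ary, ":") }
  n >= 2 && ary[2] ~ /^[0-9]+$/ && ary[2] + 0 >= 5
' "$sample")

printf '%s\n' "$result"   # only the bnd_549 line is printed
rm -f "$sample"
```

The regex guard assumes the value is a plain non-negative integer; it would need adjusting if decimals or comma-separated counts can appear there.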
Upvotes: 4