Reputation: 3022
I am trying to filter file_to_filter using another file, filter_file, which is just a list of strings in $1. I think I am close but cannot get the header row included in the output. file_to_filter is tab-delimited. Thank you :).
file_to_filter
Chr Start End Ref Alt Func.refGene Gene.refGene
chr1 160098543 160098543 G A exonic ATP1A2
chr1 172410967 172410967 G A exonic PIGC
filter_file
PIGC
desired output (header included)
Chr Start End Ref Alt Func.refGene Gene.refGene
chr1 172410967 172410967 G A exonic PIGC
awk with current output (header not included)
awk -F'\t' 'NR==1{A[$1];next}$7 in A' file test
chr1 172410967 172410967 G A exonic PIGC
Upvotes: 0
Views: 149
Reputation: 203463
Assuming your fields really are tab-separated:
awk -F'\t' 'NR==FNR{tgts[$1]; next} (FNR==1) || ($7 in tgts)' filter_file file_to_filter
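How it works: NR counts lines across all input files while FNR restarts at 1 for each file, so NR==FNR is only true while the first file (filter_file) is being read. Every $1 from that file becomes a key in tgts. For the second file, a line is printed if it is that file's first line (the header) or if its 7th field is one of the saved keys. Here is a self-contained sketch you can paste into a shell to try it. It assumes the sample data from the question and that mktemp is available; the temporary directory and the result variable are just names I picked for the demo.

```shell
# Recreate the sample files from the question (tab-separated).
tmpdir=$(mktemp -d)
printf 'Chr\tStart\tEnd\tRef\tAlt\tFunc.refGene\tGene.refGene\n' > "$tmpdir/file_to_filter"
printf 'chr1\t160098543\t160098543\tG\tA\texonic\tATP1A2\n' >> "$tmpdir/file_to_filter"
printf 'chr1\t172410967\t172410967\tG\tA\texonic\tPIGC\n' >> "$tmpdir/file_to_filter"
printf 'PIGC\n' > "$tmpdir/filter_file"

result=$(awk -F'\t' '
    # First file only (NR == FNR): save each gene name as an array key.
    NR == FNR { tgts[$1]; next }
    # Second file: print the header row, or any row whose 7th field is a saved key.
    FNR == 1 || ($7 in tgts)
' "$tmpdir/filter_file" "$tmpdir/file_to_filter")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this prints the header line followed by the PIGC line. The order of the file arguments matters: the list of strings has to come first so that tgts is filled before the data file is read.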
To start learning awk, read the book Effective AWK Programming, 4th Edition, by Arnold Robbins.
Upvotes: 2