Reputation: 3022
The awk below seems to work great, with one issue: the header line does not print in the output. I have been staring at this for a while with no luck. What am I missing? Thank you :).
awk 'NR==FNR{for (i=1;i<=NF;i++) a[$i];next} FNR==1 || ($7 in a)' /home/panels/file1 test.txt |
awk '{split($2,a,"-"); print a[1] "\t" $0}' |
sort |
cut -f2- > /home/panels/test_filtered.vcf
test.txt (the input to the awk command that produces the filtered output; only a small portion of the data is shown, and the real file is tab-delimited)
Chr Start End Ref Alt
chr1 949608 949608 G A
current output (has no header)
chr1 949608 949608 G A
desired output (has header)
Chr Start End Ref Alt
chr1 949608 949608 G A
Upvotes: 1
Views: 357
Reputation: 67507
You can combine your scripts, move the sort into awk, and handle the header this way:
$ awk 'NR==FNR{for(i=1;i<=NF;i++)a[$i]; next}
FNR==1{print "dummy\t" $0; next}
$7 in a{split($2,b,"-"); print b[1] "\t" $0 | "sort" }' file1 file2 |
cut -f2-
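A related variation, not the exact approach above: give the header an empty sort key so it sorts ahead of every data row, which keeps the whole thing a plain awk | sort | cut pipeline. This is only a minimal sketch. The sample rows are made up, the `$7 in a` filter is left out for brevity, and `LC_ALL=C` is assumed so that a leading tab sorts before digits.

```shell
#!/bin/sh
# Hypothetical sample data: one header line plus two unsorted tab-delimited rows.
tmp=$(mktemp)
printf 'Chr\tStart\tEnd\tRef\tAlt\n' > "$tmp"
printf 'chr1\t957000\t957000\tC\tT\n' >> "$tmp"
printf 'chr1\t949608\t949608\tG\tA\n' >> "$tmp"

# Decorate each line with a sort key, sort, then strip the key again.
# The header gets an empty key, so in the C locale it sorts first.
keyed_out=$(
  awk 'NR==1 { print "\t" $0; next }             # header: empty key
       { split($2, b, "-"); print b[1] "\t" $0 } # data: key taken from field 2
  ' "$tmp" | LC_ALL=C sort | cut -f2-
)
printf '%s\n' "$keyed_out"
rm -f "$tmp"
```

The sort here is lexical, as in the question; if the keys need numeric ordering, that would have to be handled separately.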
Upvotes: 0
Reputation: 212404
It looks like the header is going to sort, and getting mixed in with your data. A simple solution is to do:
... | { IFS= read -r line; printf '%s\n' "$line"; sort; } |
to prevent the first line from going to sort.
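To see the idea in isolation, here is a small self-contained sketch on made-up data. The temporary file and the `cat` stage are stand-ins for whatever comes before the sort in your pipeline. The header is read with `IFS= read -r` and printed with a quoted `printf` so that its tabs survive intact.

```shell
#!/bin/sh
# Hypothetical sample data: one header line plus two unsorted tab-delimited rows.
tmp=$(mktemp)
printf 'Chr\tStart\tEnd\tRef\tAlt\n' > "$tmp"
printf 'chr1\t957000\t957000\tC\tT\n' >> "$tmp"
printf 'chr1\t949608\t949608\tG\tA\n' >> "$tmp"

# The shell consumes the first line and prints it unchanged;
# sort then reads only the remaining lines from the same pipe.
out=$(
  cat "$tmp" | { IFS= read -r header; printf '%s\n' "$header"; LC_ALL=C sort; }
)
printf '%s\n' "$out"
rm -f "$tmp"
```

The header comes out first and the two data rows follow in sorted order.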
Upvotes: 2