Reputation: 3
I actually need to grep the entire line. I have a file with a bunch of lines that look like this:
1 123213 A T . stuff=1.232;otherstuf=34;morestuff=121;AF=0.44;laststuff=AV
4 223152 D L . stuff=1.122;otherstuf=4;morestuff=41;AF=0.02;laststuff=RV
and I want to keep all the lines where AF>0.1. So for the lines above I only want to keep the first line.
Upvotes: 0
Views: 334
Reputation: 203324
$ awk -F= '$5+0>0.1' file
1 123213 A T . stuff=1.232;otherstuf=34;morestuff=121;AF=0.44;laststuff=AV
If that doesn't do what you want when run against your real data then edit your question to provide more truly representative sample input/output.
Upvotes: 1
Reputation: 785058
Using gnu-awk you can do this:
awk 'gensub(/.*;AF=([^;]+).*/, "\\1", "1", $NF)+0 > 0.1' file
1 123213 A T . stuff=1.232;otherstuf=34;morestuff=121;AF=0.44;laststuff=AV
The gensub function parses AF=<number> out of the last field of the input and captures the number in capture group #1, which is then compared with 0.1.
PS: +0 converts the parsed field to a number.
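For awk implementations without gensub, a roughly equivalent filter can be sketched with the POSIX match() and substr() functions. This is only a sketch: it assumes AF is always preceded by a semicolon, and the temporary file is a hypothetical stand-in for the real input.

```shell
# Hypothetical temp file holding the two sample lines from the question.
sample=$(mktemp)
printf '%s\n' \
  '1 123213 A T . stuff=1.232;otherstuf=34;morestuff=121;AF=0.44;laststuff=AV' \
  '4 223152 D L . stuff=1.122;otherstuf=4;morestuff=41;AF=0.02;laststuff=RV' > "$sample"

# match() sets RSTART and RLENGTH for the first ";AF=<value>" occurrence.
# substr() skips the 4-character ";AF=" prefix, and +0 forces a numeric comparison.
result=$(awk 'match($0, /;AF=[^;]+/) && substr($0, RSTART + 4, RLENGTH - 4) + 0 > 0.1' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Lines without any ;AF= entry are dropped, because match() returns 0 for them.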
Upvotes: 2
Reputation: 4963
You could use awk with multiple delimiters to extract the value and compare it:
$ awk -F';|=' '$8 > 0.1' file
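The $8 above depends on AF always being the fourth key=value pair. If that is not guaranteed, one possible variation is to look the key up by name instead of by position. The sketch below assumes pairs are separated by spaces or semicolons; the third sample line and the temp file are made up for illustration.

```shell
# Hypothetical temp file: the question's two lines plus one invented line
# where AF appears first, to show the lookup does not depend on position.
sample=$(mktemp)
printf '%s\n' \
  '1 123213 A T . stuff=1.232;otherstuf=34;morestuff=121;AF=0.44;laststuff=AV' \
  '4 223152 D L . stuff=1.122;otherstuf=4;morestuff=41;AF=0.02;laststuff=RV' \
  '7 555 G C . AF=0.30;stuff=2.1;laststuff=XV' > "$sample"

# Split on spaces and semicolons, find the field that starts with "AF=",
# then compare the part after "=" numerically.
by_key=$(awk -F'[ ;]' '
{
  for (i = 1; i <= NF; i++) {
    if ($i ~ /^AF=/) {
      split($i, kv, "=")
      if (kv[2] + 0 > 0.1) print
      next
    }
  }
}' "$sample")
printf '%s\n' "$by_key"
rm -f "$sample"
```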
Upvotes: 1
Reputation: 157967
I would use awk. Since awk supports string (lexicographic) comparisons, you can simply use this:
awk -F';' '$(NF-1) > "AF=0.1"' file.txt
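-F';' splits the line into fields by ;. $(NF-1) addresses the second-to-last field in the line (NF is the number of fields). Note that this is a string comparison, so it only behaves like a numeric test when every AF value is written in the same form, e.g. 0.NN.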
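To see where a string comparison and a numeric comparison diverge, here is a small sketch with made-up sample lines (0.10 and 1e-3 are invented values, not taken from the question). The numeric variant strips the AF= prefix first; the variable name v is arbitrary.

```shell
# Hypothetical sample: AF is the second-to-last ;-separated field on every line.
sample=$(mktemp)
printf '%s\n' \
  'r1 x=1;AF=0.44;z=A' \
  'r2 x=1;AF=0.02;z=B' \
  'r3 x=1;AF=0.10;z=C' \
  'r4 x=1;AF=1e-3;z=D' > "$sample"

# String comparison: "AF=0.10" and "AF=1e-3" both sort after "AF=0.1".
as_string=$(LC_ALL=C awk -F';' '$(NF-1) > "AF=0.1" { print $(NF-1) }' "$sample")

# Numeric comparison: drop the "AF=" prefix, then force a number with +0.
as_number=$(awk -F';' '{ v = $(NF-1); sub(/^AF=/, "", v) } v + 0 > 0.1 { print $(NF-1) }' "$sample")

printf 'string test keeps:\n%s\n' "$as_string"
printf 'numeric test keeps:\n%s\n' "$as_number"
rm -f "$sample"
```

With these lines the string test keeps AF=0.44, AF=0.10 and AF=1e-3, while the numeric test keeps only AF=0.44.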
Upvotes: 0
Reputation: 48804
Assuming that AF is always of the form 0.NN you can simply match values where the tenths place is 1-9, e.g.:
grep ';AF=0\.[1-9][0-9];' your_file.csv
You could add a + after the second character group to support additional digits (i.e. 0.NNNNN), but if the values could be outside the range [0, 1) you shouldn't try to match the field with regular expressions. Note that the pattern also matches 0.10, which is equal to 0.1 rather than greater than it.
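If the strict inequality matters, one possible extended-regex variant is sketched below. It assumes values stay in [0, 1), are written as 0.<digits>, and are followed by a semicolon; the sample lines are invented for illustration.

```shell
# Hypothetical sample covering the boundary cases 0.1, 0.10 and 0.101.
sample=$(mktemp)
printf '%s\n' \
  'r1 x=1;AF=0.44;z=A' \
  'r2 x=1;AF=0.02;z=B' \
  'r3 x=1;AF=0.10;z=C' \
  'r4 x=1;AF=0.101;z=D' \
  'r5 x=1;AF=0.1;z=E' > "$sample"

# Either the tenths digit is 2-9, or it is 1 and some later digit is non-zero.
strict=$(grep -E ';AF=0\.(1[0-9]*[1-9][0-9]*|[2-9][0-9]*);' "$sample")
printf '%s\n' "$strict"
rm -f "$sample"
```

Only the 0.44 and 0.101 lines pass; 0.1 and 0.10 are rejected because they are not strictly greater than 0.1.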
Upvotes: 1