Reputation: 1
I have a large file that I cannot open on my computer. I am trying to delete rows of information that are unneeded. My file looks like this:
NODE,107983_gene,382,666,-,cd10161,8,49,9.0E-100,49.4,0.52,domain
NODE,107985_gene,24,659,-,PF09699.9,108,148,6.3E-500,22.5,0.8571428571428571,domain
NODE,33693_gene,213,1433,-,PF01966.21,92,230,9.0E-10,38.7,0.9344262295081968,domain
NODE,33693_gene,213,1433,-,PRK04926,39,133,1.0E-8,54.5,0.19,domain
NODE,33693_gene,213,1433,-,cd00077,88,238,4.0E-6,44.3,0.86,domain
NODE,33693_gene,213,1433,-,smart00471,88,139,9.0E-7,41.9,0.42,domain
NODE,33694_gene,1430,1912,-,cd16326,67,135,4.0E-50,39.5,0.38,domain
I am trying to remove all lines whose e-value is greater than 1.0E-10. This value is located in column 9. On the command line I have tried:
awk '$9 >=1E-10' file name > outputfile
This has given me a smaller file, but the e-values in it are all over the place; nothing above 1E-10 is actually being removed. I want to keep only the small e-values.
Does anyone have any suggestions?
Upvotes: 0
Views: 925
Reputation: 67467
Almost there; you need to specify the field delimiter. By default awk splits fields on whitespace, so each comma-separated line is a single field and `$9` is empty. Also note that `$9 >= 1E-10` keeps the large e-values; use `<` to keep the small ones:
$ awk -F, '$9<1E-10' file > small.values
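As a quick sanity check (a sketch using a subset of the sample rows from the question; the filename `file` and output name `small.values` are just the ones used above), the corrected command keeps only rows whose column-9 e-value is below 1E-10:

```shell
# Build a small sample file from four of the rows in the question.
cat > file <<'EOF'
NODE,107983_gene,382,666,-,cd10161,8,49,9.0E-100,49.4,0.52,domain
NODE,33693_gene,213,1433,-,PF01966.21,92,230,9.0E-10,38.7,0.9344262295081968,domain
NODE,33693_gene,213,1433,-,PRK04926,39,133,1.0E-8,54.5,0.19,domain
NODE,33694_gene,1430,1912,-,cd16326,67,135,4.0E-50,39.5,0.38,domain
EOF

# With -F, awk splits on commas, so $9 is the e-value column and the
# comparison is numeric. Rows with 9.0E-10 and 1.0E-8 are dropped;
# rows with 9.0E-100 and 4.0E-50 are kept.
awk -F, '$9 < 1E-10' file > small.values
cat small.values
```

This should print only the `cd10161` and `cd16326` rows, since 9.0E-10 and 1.0E-8 are both larger than the 1E-10 threshold.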
Upvotes: 2