Ryan

Reputation: 1

Delete rows of a CSV file based on a column value on the command line

I have a large file that I cannot open on my computer, and I am trying to delete the rows I don't need. My file looks like this:

NODE,107983_gene,382,666,-,cd10161,8,49,9.0E-100,49.4,0.52,domain
NODE,107985_gene,24,659,-,PF09699.9,108,148,6.3E-500,22.5,0.8571428571428571,domain
NODE,33693_gene,213,1433,-,PF01966.21,92,230,9.0E-10,38.7,0.9344262295081968,domain
NODE,33693_gene,213,1433,-,PRK04926,39,133,1.0E-8,54.5,0.19,domain
NODE,33693_gene,213,1433,-,cd00077,88,238,4.0E-6,44.3,0.86,domain
NODE,33693_gene,213,1433,-,smart00471,88,139,9.0E-7,41.9,0.42,domain
NODE,33694_gene,1430,1912,-,cd16326,67,135,4.0E-50,39.5,0.38,domain

I am trying to remove all lines that have an E-value greater than 1.0E-10. This information is located in column 9. On the command line I have tried:

awk '$9 >= 1E-10' filename > outputfile

This has given me a smaller file, but the E-values in the output are all over the place; lines with values above 1E-10 are not actually being removed. I only want the small E-values.

Does anyone have any suggestions?

Upvotes: 0

Views: 925

Answers (1)

karakfa

Reputation: 67467

Almost there; you need to specify the field delimiter. Without -F, awk splits on whitespace, so $9 is not the E-value column in a comma-separated file. Flipping the comparison to < keeps only the small E-values you want:

$ awk -F, '$9<1E-10' file > small.values
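
If you later need to change the cutoff without editing the script, awk's -v option can pass the threshold in from the shell; a minimal sketch, assuming the same nine-column layout (thr is just an illustrative variable name):

$ awk -F, -v thr=1E-10 '$9 < thr' file > small.values

Because the ninth field looks numeric, the comparison is done numerically, so scientific notation such as 9.0E-100 is handled correctly.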

Upvotes: 2
