Reputation: 3
I'm trying to filter a large tab-delimited file and print only the lines whose score in one of the columns is greater than 0.999, but for some reason the script's output still contains every line. Any insight into why my "if score > 0.999:" isn't working as intended?
import sys
import string
import re

def split_lines(lines):
    for line in lines:
        if line.find('#') >-1:
            print line
        else:
            #pass
            #fields = re.split('\t',line)
            fields = line.split('\t')
            score = fields[3]
            if score > 0.999:
                print score
            #else:
            #    pass

data = sys.stdin.read()
lines = data.split('\n')
split_lines(lines)
Upvotes: 0
Views: 48
Reputation: 12670
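You need to convert score from a string to a number (float or Decimal) before comparing it. line.split('\t') returns strings, and in Python 2 (CPython) a string always compares greater than any number, so "score > 0.999" is true for every line.
if float(score) > 0.999: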
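For illustration, here is a minimal sketch of the whole filter with the conversion applied. It uses Python 3 syntax (the print function) rather than the question's Python 2, and the function name filter_high_scores, its threshold and column parameters, and the sample rows are all hypothetical. It also assumes you want the whole matching line rather than just the score, and that rows with a missing or non-numeric score should be skipped.

```python
def filter_high_scores(lines, threshold=0.999, column=3):
    """Return comment lines plus data lines whose score exceeds threshold.

    Hypothetical helper: the name, parameters, and the policy of skipping
    malformed rows are assumptions, not part of the original script.
    """
    kept = []
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue  # skip blank lines, e.g. the trailing one after split('\n')
        if '#' in line:
            kept.append(line)  # pass comment/header lines through unchanged
            continue
        fields = line.split('\t')
        try:
            # split() yields strings, so convert before comparing
            score = float(fields[column])
        except (IndexError, ValueError):
            continue  # too few columns or a non-numeric score
        if score > threshold:
            kept.append(line)
    return kept


# Made-up sample rows standing in for the tab-delimited file.
sample = [
    "# header",
    "a\tb\tc\t0.9995",
    "a\tb\tc\t0.5",
    "a\tb\tc\t1.0",
    "",
    "bad\tline",
]

for kept_line in filter_high_scores(sample):
    print(kept_line)
```

To run it on real input, you could pass sys.stdin directly instead of sample, which also avoids reading the whole large file into memory at once.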
Upvotes: 3