Reputation:
I have a .txt file with data in the following format:
pq1000007 35 2 237493054 0.013328573
I am trying to use regex that will capture the first, third, and last number, but only if the last number is greater than .4. For some reason, I get the message that "NoneType object has no attribute 'group'". Any ideas?
Code:
InFileName = "PerkQP_CHGV_SCZ.txt"
InFile = open(InFileName, 'r')
OutFileName='PAZ_OUT' + ".txt"
OutFile=open(OutFileName, 'w')
for Line in InFile:
    match = re.search('(\w+)\s\d+\s(\d+)\s\d+\d+\s(\d+\.\d+)', Line)
    if match.group(2) > 0.4:
        c = match.group()
        print(c)
        OutFile.write(c+"\n")
InFile.close()
OutFile.close()
Upvotes: 1
Views: 223
Reputation: 336378
A few problems:
The text captured by a regex group is a string, so you can't meaningfully compare it with a float (in fact, in Python 3 it's illegal to do so and raises a TypeError). In Python 2, any string always compares greater than any float (CPython 2 orders mismatched types by an arbitrary but consistent rule, and numbers sort before strings. Yes, this rule makes no sense. Good that Python 3 did away with it).
Then, the last number in that regex is in the third capturing group, so you'd need to do
if float(match.group(3)) > 0.4:
Then, you should use a raw string (r"...") for your regex, so that Python leaves the backslashes alone.
Finally, \d+\d+ is redundant (it just means "two or more digits"); \d+ will do.
match = re.search(r'(\w+)\s\d+\s(\d+)\s\d+\s(\d+\.\d+)', Line)
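As a quick check of the points above, here is a minimal sketch that runs both patterns against the sample line from the question. The names `fixed`, `original`, and `short` are only for illustration, and the `short` line is made up to show one case where `\d+\d+` and `\d+` behave differently.

```python
import re

fixed = re.compile(r'(\w+)\s\d+\s(\d+)\s\d+\s(\d+\.\d+)')
original = re.compile(r'(\w+)\s\d+\s(\d+)\s\d+\d+\s(\d+\.\d+)')

sample = "pq1000007 35 2 237493054 0.013328573"   # line from the question
groups = fixed.search(sample).groups()
print(groups)                      # every captured group is a str

last = float(groups[2])            # convert to a number before comparing
print(last > 0.4)

# Hypothetical line whose fourth field has only one digit.
short = "pq1 35 2 7 0.5"
print(original.search(short))      # \d+\d+ needs at least two digits here
print(fixed.search(short) is not None)
```

If some lines in the real file have a single-digit fourth field, the original pattern would return None for them, which is one possible source of the error.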
This regex matches the example line you gave it, so your error message (which indicates a non-match) must have a different origin. Perhaps there is a line somewhere in your file that does not match the regex. In that case, you could structure your program like this:
for Line in InFile:
    match = re.search(r'(\w+)\s\d+\s(\d+)\s\d+\s(\d+\.\d+)', Line)
    if match:
        if float(match.group(3)) > 0.4:
            pass  # do stuff
    else:
        print("No match:", Line)
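Putting the pieces together, a complete version might look like the sketch below. It assumes you want the first, third, and last fields written out separated by spaces (the original code wrote the whole match instead); `filter_lines` and the second sample line are invented for the example.

```python
import re

PATTERN = re.compile(r'(\w+)\s\d+\s(\d+)\s\d+\s(\d+\.\d+)')

def filter_lines(lines, threshold=0.4):
    """Yield 'first third last' for lines whose last number exceeds threshold."""
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            # Report the offending line instead of crashing on match.group()
            print("No match:", line.rstrip())
            continue
        first, third, last = match.groups()
        if float(last) > threshold:
            yield " ".join((first, third, last))

# Only the first line comes from the question; the rest is made-up test data.
sample = [
    "pq1000007 35 2 237493054 0.013328573\n",
    "pq1000008 36 3 237493055 0.52\n",
    "\n",
]
kept = list(filter_lines(sample))
print(kept)
```

Because a file object is itself an iterable of lines, the same function should work on the real data: open "PerkQP_CHGV_SCZ.txt" and "PAZ_OUT.txt" in a single `with` statement, loop over `filter_lines(InFile)`, and write each row followed by "\n". The `with` statement also closes both files for you.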
Upvotes: 1
Reputation: 251428
If the result of the search is None, that means your regex is not matching. It seems to work for the example you give, but perhaps your actual data in the file doesn't match the pattern. (Also, I see that your regex contains \d+\d+, which should just be \d+.)
In addition, match.group() returns a string. You need to convert that to a number before comparing it to 0.4, e.g. with float(match.group(3)); the last number is in group 3, not group 2.
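To find out which lines in the file are not matching, something along these lines may help. It is only a sketch: `non_matching` is a made-up helper, and the second and third sample lines are guesses at what a problem line could look like (a header row, or a value without a decimal point).

```python
import re

PATTERN = re.compile(r'(\w+)\s\d+\s(\d+)\s\d+\s(\d+\.\d+)')

def non_matching(lines):
    """Return (line_number, repr(line)) for every line the pattern rejects."""
    return [(number, repr(line))
            for number, line in enumerate(lines, start=1)
            if PATTERN.search(line) is None]

sample = [
    "pq1000007 35 2 237493054 0.013328573\n",  # line from the question
    "id a b pos score\n",                      # hypothetical header row
    "pq1000009 35 2 237493054 1\n",            # hypothetical: no decimal point
]
for number, text in non_matching(sample):
    print(number, text)
```

Printing repr(line) rather than the line itself makes blank lines, tabs, and other invisible characters easy to spot.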
Upvotes: 1