Reputation: 1185
I am testing a small Python script, part of which I will use in a larger script. Basically, I am trying to look up a field in a CSV file (the field contains a regex) and use it in a regex test. This is part of a very weird use case: it makes it easier to maintain the patterns in a CSV file instead of in the script. Is there something I am missing with the following?
test.csv:
field0,field1,field2
foo,bar,"\d+\.\d+"
bar,foo,"\w+"
test.py (extra prints used for testing):
import sys
import re
import csv
input = sys.argv[1]
print input
reader = csv.reader(open('test.csv','rb'), delimiter=',', quotechar="\"")
for row in reader:
    print row
    value = row[0]
    print value
    if value in input:
        regex = row[2]
        print regex
        pat = re.compile(regex)
        test = re.match(pat,input)
        out = test.group(1)
        print out
If I pass a value like "foo blah 38902462986.328946239846" to the script, I would expect it to notice that the input contains foo and then use the regex \d+\.\d+ to extract 38902462986.328946239846. However, when I run the script I get the following:
foo blah 0920390239.90239029
['field0', 'field1', 'field2']
field0
['foo', 'bar', '\\d+\\.\\d+']
foo
\d+\.\d+
Traceback (most recent call last):
  File "reg.py", line 19, in <module>
    out = test.group(1)
AttributeError: 'NoneType' object has no attribute 'group'
Not sure what's going on really.
P.S. Python is a big world and I am still learning.
Upvotes: 0
Views: 1475
Reputation: 15299
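According to the docs, re.match only matches at the beginning of the input string, so it returns None here because the input starts with "foo" rather than digits. You need to use re.search, which scans the whole string. Also, there's no need to compile the pattern if you don't reuse it afterwards. Just say test = re.search(regex, input).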
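The regular expressions in your example don't have any capture groups, so test.group(1) is going to fail (with IndexError: no such group) even if there's a match in the input. Use test.group(0), or the span from test.start() to test.end(), to get the whole match.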
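To make the two points above concrete, here is a minimal sketch of how re.match, re.search, and group numbering behave on the sample input. It is written in Python 3 syntax (the thread's code is Python 2), and the variable names are chosen only for illustration.

```python
import re

text = "foo blah 38902462986.328946239846"
pattern = r"\d+\.\d+"

# re.match only tries position 0, where the text starts with "foo", so it fails.
print(re.match(pattern, text))            # None

# re.search scans the whole string and returns the first match.
found = re.search(pattern, text)
print(found.group(0))                     # 38902462986.328946239846

# The pattern has no parentheses, so group 1 does not exist.
try:
    found.group(1)
except IndexError as err:
    print("IndexError:", err)

# Adding a capture group to the pattern makes group(1) available.
captured = re.search(r"(\d+)\.\d+", text)
print(captured.group(1))                  # 38902462986
```

If the CSV patterns should expose only part of the match, they would need parentheses around that part; otherwise group(0) is the value to read.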
import sys
import re
import csv
input = 'foo blah 38902462986.328946239846'
reader = csv.reader(open('test.csv','rb'), delimiter=',', quotechar="\"")
for row in reader:
    value = row[0]
    if value in input:
        regex = row[2]
        # re.search scans the whole string; re.match only tries position 0
        test = re.search(regex, input)
        if test:  # re.search returns None when nothing matches
            print input[test.start():test.end()]
Prints:
38902462986.328946239846
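For anyone on Python 3, the same lookup could be sketched roughly as follows. This is an illustration under assumptions, not the original poster's script: the helper name extract and the inline CSV_TEXT constant are hypothetical, and the CSV is read from a string here so the example runs without a test.csv on disk.

```python
import csv
import io
import re

# Inline copy of test.csv (assumption: same three columns as in the question).
CSV_TEXT = r'''field0,field1,field2
foo,bar,"\d+\.\d+"
bar,foo,"\w+"
'''

def extract(text, csv_text=CSV_TEXT):
    """Return the first regex hit among rows whose field0 occurs in text, else None."""
    reader = csv.DictReader(io.StringIO(csv_text))  # header row becomes the keys
    for row in reader:
        if row["field0"] in text:
            found = re.search(row["field2"], text)
            if found is not None:  # skip rows whose pattern does not match
                return found.group(0)
    return None

print(extract("foo blah 38902462986.328946239846"))
print(extract("nothing to see"))
```

Using csv.DictReader means the header row is consumed as column names instead of being tested against the input, and the explicit None check avoids the AttributeError from the question when a pattern does not match.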
Upvotes: 1