Reputation: 4663
I'm not sure why this isn't working:
import re
import csv

def check(q, s):
    match = re.search(r'%s' % q, s, re.IGNORECASE)
    if match:
        return True
    else:
        return False

tstr = []
# test strings
tstr.append('testthisisnotworking')
tstr.append('This is a TEsT')
tstr.append('This is a TEST mon!')

f = open('testwords.txt', 'rU')
reader = csv.reader(f)
for type, term, exp in reader:
    for i in range(2):
        if check(exp, tstr[i]):
            print exp + " hit on " + tstr[i]
        else:
            print exp + " did NOT hit on " + tstr[i]
f.close()
testwords.txt contains this line:
blah, blah, test
So essentially 'test' is the RegEx pattern. Nothing complex, just a simple word. Here's the output:
test did NOT hit on testthisisnotworking
test hit on This is a TEsT
test hit on This is a TEST mon!
Why does it NOT hit on the first string? I also tried \s*test\s* with no luck. Help?
Upvotes: 1
Views: 95
Reputation: 208425
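Adding print repr(exp) at the top of the first for loop shows that exp is ' test'; note the leading space. This isn't surprising, since csv.reader() splits only on commas and keeps the space that follows each one. Because the pattern then begins with a space, it cannot match 'testthisisnotworking', which contains no space. Try changing your code to the following: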
for type, term, exp in reader:
    exp = exp.strip()
    for s in tstr:
        if check(exp, s):
            print exp + " hit on " + s
        else:
            print exp + " did NOT hit on " + s
Note that in addition to the strip() call, which removes leading and trailing whitespace, I changed your second for loop to iterate directly over the strings in tstr instead of over a range. There was actually a bug in your current code: tstr contains three values, but you only checked the first two, because for i in range(2) only gives you i=0 and i=1.
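To see the effect of the leading space in isolation, here is a minimal sketch that skips the file entirely. It is written in Python 3 syntax (the code above is Python 2), and the names raw, clean, raw_hits and clean_hits are made up for illustration; raw stands in for the field as csv.reader() would return it.

```python
import re

def check(pattern, text):
    """Return True if pattern matches anywhere in text, ignoring case."""
    return re.search(pattern, text, re.IGNORECASE) is not None

strings = ['testthisisnotworking', 'This is a TEsT', 'This is a TEST mon!']

raw = ' test'        # assumed value of the third field, leading space included
clean = raw.strip()  # 'test'

# The pattern ' test' needs a literal space before "test", so the first
# string (which has no spaces at all) cannot match.
raw_hits = [check(raw, s) for s in strings]
clean_hits = [check(clean, s) for s in strings]

print(raw_hits)    # [False, True, True]
print(clean_hits)  # [True, True, True]
```

This also suggests why \s*test\s* did not help: with the leading space still in exp, the pattern would still begin with a literal space before the \s*.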
Upvotes: 3
Reputation: 992857
The csv module by default keeps the blank spaces around fields in the input (this can be changed by using a different "dialect"). So exp contains " test" with a leading space.
A quick way to fix this would be to add:
exp = exp.strip()
after you read from the CSV file.
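As a sketch of the dialect route mentioned above, csv.reader() accepts a skipinitialspace option that drops whitespace immediately after each delimiter. The example below uses Python 3 syntax and an in-memory io.StringIO in place of testwords.txt; the variable names are only illustrative.

```python
import csv
import io

# Stand-in for the contents of testwords.txt
data = io.StringIO('blah, blah, test\n')

# Default behaviour: the space after each comma stays in the field.
default_rows = list(csv.reader(data))

# Rewind and read again, skipping whitespace that follows a delimiter.
data.seek(0)
trimmed_rows = list(csv.reader(data, skipinitialspace=True))

print(default_rows)  # [['blah', ' blah', ' test']]
print(trimmed_rows)  # [['blah', 'blah', 'test']]
```

Note that skipinitialspace only handles whitespace right after a comma; trailing whitespace in a field would still need strip().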
Upvotes: 6