Reputation: 159
I would like Python regular expression code to:
I am no good at all with regular expressions.
Upvotes: 0
Views: 176
Reputation: 82992
Your self-answer would be better using a set rather than that loop. Using i for a text variable and n for an index is very counter-intuitive. And keywords_found is a misnomer.
Try this:
>>> import re
>>> keywords = set(('cars', 'jewelry', 'gas'))
>>> pattern = re.compile('[a-z]+', re.IGNORECASE)
>>> txt = 'GAS, CaRs, Jewelrys'
>>> text_words = set(pattern.findall(txt.lower()))
>>> print "keywords:", keywords
keywords: set(['cars', 'gas', 'jewelry'])
>>> print "text_words:", text_words
text_words: set(['cars', 'gas', 'jewelrys'])
>>> print "text words in keywords:", text_words & keywords
text words in keywords: set(['cars', 'gas'])
>>> print "text words NOT in keywords:", text_words - (text_words & keywords)
text words NOT in keywords: set(['jewelrys'])
>>> print "keywords NOT in text words:", keywords - (text_words & keywords)
keywords NOT in text words: set(['jewelry'])
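The session above is Python 2 (print statement, set([...]) display). For readers on Python 3, the same comparison might be sketched as a small function. The name compare_keywords and its three-set return value are illustrative assumptions, not part of the original answer. Since text_words - (text_words & keywords) gives the same result as text_words - keywords, the sketch uses the shorter form.

```python
import re

# Only lowercase letters are needed because the text is lowercased first.
WORD_PATTERN = re.compile(r'[a-z]+')


def compare_keywords(txt, keywords):
    """Hypothetical helper: return (matched, unknown_words, missing_keywords)."""
    keywords = {k.lower() for k in keywords}
    text_words = set(WORD_PATTERN.findall(txt.lower()))
    matched = text_words & keywords      # words that are keywords
    unknown = text_words - keywords      # words that are not keywords
    missing = keywords - text_words      # keywords absent from the text
    return matched, unknown, missing


matched, unknown, missing = compare_keywords('GAS, CaRs, Jewelrys',
                                             ('cars', 'jewelry', 'gas'))
print(sorted(matched))   # ['cars', 'gas']
print(sorted(unknown))   # ['jewelrys']
print(sorted(missing))   # ['jewelry']
```

Sorting before printing keeps the output stable, since set ordering is not guaranteed.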
Upvotes: 0
Reputation: 159
After much Google searching and trial and error, I created a solution that works to separate multiple words from the input string.
import re
keywords = ('cars', 'jewelry', 'gas')
pattern = re.compile('[a-z]+', re.IGNORECASE)
txt = 'GAS, CaRs, Jewelrys'
keywords_found = pattern.findall(txt.lower())
n = 0
for i in keywords_found:
    if i in keywords:
        print keywords_found[n]
    n = n + 1
Upvotes: 0
Reputation: 799082
Why bother?
>>> 'FOO'.lower() in set(('foo', 'bar', 'baz'))
True
>>> 'Quux'.lower() in set(('foo', 'bar', 'baz'))
False
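If the input holds several words, the same membership test can be applied to each one without a regular expression. This is a minimal Python 3 sketch that assumes the words are comma-separated, as in the asker's sample text 'GAS, CaRs, Jewelrys'. The names KEYWORDS and find_keywords are hypothetical.

```python
# Assumed keyword list, taken from the sample in this thread.
KEYWORDS = frozenset(('cars', 'jewelry', 'gas'))


def find_keywords(txt):
    """Hypothetical helper: return the comma-separated words that are keywords."""
    # Assumption: words are separated by commas, with optional whitespace.
    words = (w.strip().lower() for w in txt.split(','))
    return [w for w in words if w in KEYWORDS]


print(find_keywords('GAS, CaRs, Jewelrys'))  # ['gas', 'cars']
```

If the separators are less predictable than commas, splitting with a regular expression is the more robust choice.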
Upvotes: 6