Reputation: 3721
I'm filtering tweets in my application and want to return all tweets that contain a certain word in the text. For example, if I am filtering on BBC, I want every variant of it, e.g. BBC, bbc, BBC1, #BBC, @bbc. How could I write the regex?
So far I'm doing:
re.compile(r'#|@[0-9]'+term, re.IGNORECASE)
Term is a list containing words, and I want only the words in that list returned, either on their own or with an extra @, # or 0-9 prepended or appended.
Thanks
Upvotes: 3
Views: 523
Reputation: 179402
Use the '\b' word-boundary anchor to find whole words:
re.compile(r'\b(?:#|@|)[0-9]*%s[0-9]*\b' % re.escape(term), re.IGNORECASE)
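Since the question mentions a list of words, here is a minimal sketch of how the same idea might be applied to the whole list at once. The names `build_term_pattern` and `filter_tweets` are made up for illustration, and it assumes the list is non-empty and that tweets are plain strings. One deliberate difference from the line above: because `#` and `@` are not word characters, a leading `\b` only matches before them when a word character comes first, so this version uses lookarounds instead so the `#` or `@` becomes part of the match.

```python
import re

def build_term_pattern(terms):
    """Compile one case-insensitive pattern covering every word in `terms`.

    Hypothetical helper; assumes `terms` is a non-empty list of strings.
    """
    # Escape each term so characters like '.' or '+' are taken literally,
    # and try longer terms first so 'BBC News' wins over 'BBC'.
    alternatives = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    # (?<!\w) and (?!\w) act like word boundaries that also work
    # when the match starts with '#' or '@'.
    return re.compile(
        r'(?<!\w)[#@]?[0-9]*(?:%s)[0-9]*(?!\w)' % alternatives,
        re.IGNORECASE,
    )

def filter_tweets(tweets, terms):
    """Return the tweets whose text contains any of the terms."""
    pattern = build_term_pattern(terms)
    return [tweet for tweet in tweets if pattern.search(tweet)]

tweets = [
    "Watching BBC tonight",
    "love #bbc news",
    "BBC1 is on",
    "hey @bbc",
    "the abbcd thing",
    "nothing here",
]
matching = filter_tweets(tweets, ["BBC"])
```

With this sample data, `matching` should hold the first four tweets; "the abbcd thing" is skipped because the term sits inside a longer word.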
Upvotes: 2