Reputation: 367
I have the following string:
The quick brown fox, the cat in the (hat) and the dog in the pound. The Cat in THE (hat):
I need help with extracting the following text:
1) the cat in the (hat)
2) The Cat in THE (hat)
I have tried the following:
import io
import os
import re

p1 = """The quick brown fox, the cat in the (hat) and the dog in the pound. The Cat in THE (hat)"""
pattern = r'\b{var}\b'.format(var=p1)
with io.open(os.path.join(directory, file), 'r', encoding='utf-8') as textfile:
    for line in textfile:
        result = re.findall(pattern, line)
        print(result)
Upvotes: 1
Views: 246
Reputation: 3288
To match exactly that phrase, you can use this regex. The (?i) at the beginning makes the match case-insensitive, and each parenthesis is escaped with a backslash (\) so that it is matched as a literal character instead of being read as a group.
import re

# Raw string, so \( and \) reach the regex engine as escaped literal parentheses
regex = re.compile(r'(?i)the cat in the \(hat\)')
string = 'The quick brown fox, the cat in the (hat) and the dog in the pound. The Cat in THE (hat):'
print(regex.findall(string))
Result:
['the cat in the (hat)', 'The Cat in THE (hat)']
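To generalize beyond this one phrase, a possible approach is to let re.escape do the escaping and pass re.IGNORECASE as a flag instead of writing (?i) by hand. This is only a sketch: find_phrase is a hypothetical helper name, not something from the question, and it assumes you want every literal occurrence of the phrase regardless of case.

```python
import re

def find_phrase(phrase, text):
    """Return every case-insensitive occurrence of a literal phrase in text.

    find_phrase is a hypothetical helper introduced for illustration.
    """
    # re.escape backslash-escapes regex metacharacters such as ( and ),
    # so the phrase is matched literally rather than interpreted as a pattern.
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return pattern.findall(text)

text = ('The quick brown fox, the cat in the (hat) and the dog in the pound. '
        'The Cat in THE (hat):')
matches = find_phrase('the cat in the (hat)', text)
print(matches)
```

With this, the phrase can come from a variable or user input without having to escape the special characters manually.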
Upvotes: 4