Reputation: 125
I have a text file and my goal is to generate an output file with all the words that are between two specific words.
For example, if I have this text:
askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj.
And I want to obtain all the words between "my" and "Alex".
Output:
my name is Alex
I have a rough idea, but I don't know how to select the range between the two words:
if 'my' in open(out).read():
    with open('results.txt', 'w') as f:
        if 'Title' in open(out).read():
            f.write('*')
            break
I want an output file with the sentence "my name is Alex".
Upvotes: 0
Views: 7697
Reputation: 250961
You can use a regex here:
>>> import re
>>> s = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj."
>>> re.search(r'my.*Alex', s).group()
'my name is Alex'
If the string contains more than one Alex after my and you want only the shortest match, use the non-greedy .*? instead of .*:
With ?:
>>> s = "my name is Alex and you're Alex too."
>>> re.search(r'my.*?Alex', s).group()
'my name is Alex'
Without ?:
>>> re.search(r'my.*Alex', s).group()
"my name is Alex and you're Alex"
Code:
import re

with open('infile') as f1, open('outfile', 'w') as f2:
    data = f1.read()
    # re.DOTALL lets . match newlines, so the match can span several lines
    match = re.search(r'my.*Alex', data, re.DOTALL)
    if match:
        f2.write(match.group())
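If the two words are not fixed, the same idea can be wrapped in a small helper. This is only a sketch: the name extract_between is made up for illustration, and the \b word boundaries and re.escape calls are additions that the pattern above does not use. They are there so that "my" does not match inside a word like "army" and so that words containing regex metacharacters are treated literally.

```python
import re

def extract_between(text, start, end):
    """Return the shortest span from `start` through `end`, or None."""
    # re.escape treats the words literally; \b requires whole-word matches.
    pattern = r'\b' + re.escape(start) + r'\b.*?\b' + re.escape(end) + r'\b'
    # re.DOTALL lets the span cross line breaks, as in the file example.
    match = re.search(pattern, text, re.DOTALL)
    return match.group() if match else None

s = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj."
print(extract_between(s, 'my', 'Alex'))
```

The helper uses the non-greedy .*? so that it stops at the first Alex after my; switch to .* if you want the span up to the last one.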
Upvotes: 2
Reputation: 239473
You can use the regular expression my.*Alex:
import re
data = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj"
print(re.search(r"my.*Alex", data).group())
Output
my name is Alex
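One thing to watch for: re.search returns None when the pattern is not found, so calling .group() directly raises AttributeError on text that has no match. A minimal sketch of a guarded version follows; the function name find_span and the choice to return an empty string are assumptions made for illustration, not part of the answer above.

```python
import re

data = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj"

def find_span(text):
    # re.search returns None when nothing matches, so check before .group()
    match = re.search(r"my.*Alex", text)
    return match.group() if match is not None else ""

print(find_span(data))
```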
Upvotes: 0