Reputation: 229
I am attempting to extract the text between two keywords. My solution so far uses split() to turn the string into a list. It works, but I was wondering whether there is a more efficient or elegant way to achieve this. Below is my code:
words = "Your meeting with Dr Green at 8pm"
list_words = words.split()
before = "with"
after = "at"
title = list_words[list_words.index(before) + 1]
name = list_words[list_words.index(after) - 1]
if title != name:
    var = title + " " + name
    print(var)
else:
    print(title)
Results:
>>> Dr Green
I'd prefer a solution that is configurable, as the text I'm searching for can be dynamic: Dr Green could be replaced by a name with 4 words or 1 word.
Upvotes: 0
Views: 1376
Reputation: 2891
How about slicing the string between start and end, then just splitting it?
words = "Your meeting with Dr Jebediah Caruseum Green at 8pm"
start = "with"
end = "at"
list_of_stuff = words[words.index(start):words.index(end)].replace(start, '', 1).split()
list_of_stuff
['Dr', 'Jebediah', 'Caruseum', 'Green']
You can do anything you like with the list. For example, I would parse for the title like this:
list_of_titles = ['Dr', 'Sr', 'GrandMaster', 'Pleb']
try:
    title = [i for i in list_of_stuff if i in list_of_titles][0]
except IndexError:
    # title not found, skipping
    title = ''
name = ' '.join([x for x in list_of_stuff if x != title])
print(title, name)
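One caveat with slicing the raw string is that words.index(end) finds the first occurrence of "at" anywhere, including inside another word. A minimal sketch of the same idea applied to the list of words instead, assuming the keywords always appear as whole, space-separated words (the helper name words_between is hypothetical, not part of the answer above):

```python
def words_between(text, start, end):
    """Return the words strictly between the first `start` word and the next `end` word."""
    tokens = text.split()
    i = tokens.index(start)        # raises ValueError if `start` is absent
    j = tokens.index(end, i + 1)   # look for `end` only after `start`
    return tokens[i + 1:j]


print(words_between("Your meeting with Dr Jebediah Caruseum Green at 8pm", "with", "at"))
# ['Dr', 'Jebediah', 'Caruseum', 'Green']
```

Because the comparison is done word by word, a name such as Matthews does not end the slice early.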
Upvotes: 0
Reputation: 2085
Sounds like a job for regular expressions. This uses the pattern (?:with)(.*?)(?:at) to look for 'with' and 'at', and lazily match anything in between.
import re
words = 'Your meeting with Dr Green at 8pm'
start = 'with'
end = 'at'
pattern = r'(?:{})(.*?)(?:{})'.format(start, end)
match = re.search(pattern, words).group(1).strip()
print(match)
Output:
Dr Green
Note that the regex does actually capture the spaces on either side of Dr Green, so I've included a simple .strip() to remove the leading and trailing whitespace.
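Since the keywords are meant to be configurable, it may be worth hardening the pattern a little. The sketch below is one possible variant, not the only way to do it: it escapes the keywords with re.escape, anchors them with \b so that "at" inside a word like Matthews is ignored, and returns None when nothing matches. The function name text_between is hypothetical, and the \b anchors assume each keyword starts and ends with a word character.

```python
import re


def text_between(text, start, end):
    """Return the text between the whole words `start` and `end`, or None if absent."""
    pattern = r'\b{}\b\s*(.*?)\s*\b{}\b'.format(re.escape(start), re.escape(end))
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(text_between('Your meeting with Dr Matthews at 8pm', 'with', 'at'))
# Dr Matthews
```

The \s* on either side of the group absorbs the surrounding spaces, so no separate strip() is needed here.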
Upvotes: 3
Reputation: 3822
Using the re module:
import re
words = "Your meeting with Dr Green at 8pm"
before = "Dr"
after = "at"
result = re.search('%s(.*)%s' % (before, after), words).group(1)
print(before + result)
Output:
Dr Green
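Note that (.*) is greedy, so it runs up to the last occurrence of the end keyword rather than the first. A small comparison, using a made-up sentence with two occurrences of "at" (the sentence is an assumption for illustration only):

```python
import re

words = "Your meeting with Dr Green at 8pm at the clinic"

# Greedy: .* grabs as much as possible, so it stops at the LAST "at".
greedy = re.search(r'Dr(.*)at', words).group(1)

# Lazy: .*? grabs as little as possible, so it stops at the FIRST "at".
lazy = re.search(r'Dr(.*?)at', words).group(1)

print(repr(greedy))  # ' Green at 8pm '
print(repr(lazy))    # ' Green '
```

If the end keyword can appear more than once, the lazy form is usually the one you want.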
Upvotes: 0