Reputation: 9
I am a complete newbie to Python and would appreciate any help. Below is a sample text string from which I am trying to extract 2 substrings:
Sample text: Your booking at Crown Street - June 29th, 1:00pm
The Location substring sits between the following 2 phrases, which are constant: "Your booking at " and " -". The spaces included in the phrases are deliberate. In this example, my required output string is Crown Street. What is the best Python regex to deliver this outcome?
The Timestamp substring follows the "- " expression in the string. In this example, my required output string is June 29th, 1:00pm. What is the best Python regex to deliver this outcome?
Upvotes: 0
Views: 430
Reputation: 82785
Using re.search
Demo:
import re
text = "Your booking at Crown Street - June 29th, 1:00pm"
# Raw string so that \s reaches the regex engine without escape warnings
data = re.search(r"Your booking at\s+(.*)\s+-\s+(.*)", text)
if data:
    print(data.group(1))  # location
    print(data.group(2))  # timestamp
Output:
Crown Street
June 29th, 1:00pm
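As a minimal sketch, the same pattern could be wrapped in a small helper that returns both substrings, or None when the text does not have the expected shape. The names BOOKING_RE and parse_booking are hypothetical and only used for illustration.

```python
import re

# Same pattern as in the demo, precompiled; the raw string keeps \s intact.
BOOKING_RE = re.compile(r"Your booking at\s+(.*)\s+-\s+(.*)")


def parse_booking(text):
    """Return (location, timestamp), or None if the text does not match."""
    match = BOOKING_RE.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_booking("Your booking at Crown Street - June 29th, 1:00pm"))
print(parse_booking("No booking here"))
```

Returning None instead of raising keeps the caller in control of what to do with text that does not match.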
Upvotes: 0
Reputation: 169267
import re
example = 'Your booking at Crown Street - June 29th, 1:00pm'
regex = re.compile(r'Your booking at (?P<location>.+) - (?P<timestamp>.+)$')
print(regex.match(example).groupdict())
outputs
{'location': 'Crown Street', 'timestamp': 'June 29th, 1:00pm'}
Notice that this could split in the wrong place if " - " appears more than once in the string: because .+ is greedy, the location group runs up to the last " - ". If you're always sure there'll be an English month to start the timestamp, you could use (?P<timestamp>(?:Jan|Feb|Mar|...).+).
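A rough sketch of the month-anchored variant, assuming every timestamp starts with an English month name or its three-letter abbreviation. The full month list and the sample string with a time range are assumptions added for illustration, not part of the question.

```python
import re

# Assumption: timestamps always begin with one of these month prefixes.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

regex = re.compile(
    r"Your booking at (?P<location>.+) - (?P<timestamp>(?:" + MONTHS + r").+)$"
)

# Hypothetical input whose timestamp itself contains " - ".
example = "Your booking at Crown Street - June 29th, 1:00pm - 2:00pm"

match = regex.match(example)
if match:  # match is None when the text does not fit the pattern
    print(match.groupdict())
```

With the month prefix in place, the split can only happen at a " - " that is directly followed by a month, so here the location should come out as Crown Street and the whole time range stays in the timestamp.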
Upvotes: 1