Reputation: 33
I have a bunch of input strings in the following (simplified) format:
"Hello my name is Dan"
"Hey my name is Tony"
"Hey|Hello|Hi my name is _"
I'm trying to write a regular expression to extract the name from the previous examples, but I'm stuck on how to do it.
I currently have
import re
r = re.search("(Hello|Hey|Hi) my name is .+")
How do I actually get the captured name?
Upvotes: 3
Views: 216
Reputation: 705
You're actually not too far off. You're missing the text to search, which re.search takes as its second argument, but I'm guessing you actually want to compile a pattern to use later:
import re
r = re.compile("(Hello|Hey|Hi) my name is (.+)")
# ... later
match = r.search(text)
if match:
    # groups() is a 0-indexed tuple, so [1] is the second group (the name)
    name = match.groups()[1]
What's going on here is that everything surrounded by parentheses is treated as a capturing group, which you can refer to later if there's a match. You can also give a group a name with (?P<name>PATTERN) and retrieve it with match.group('name').
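To illustrate the named-group variant, here is a minimal sketch. The helper name extract_name and the group label "name" are my own choices for illustration, not something from the question:

```python
import re

# The greeting stays a numbered group; the name gets the label "name".
# Both the helper and the label are illustrative assumptions.
pattern = re.compile(r"(Hello|Hey|Hi) my name is (?P<name>.+)")

def extract_name(text):
    """Return the captured name, or None when the text does not match."""
    match = pattern.search(text)
    return match.group("name") if match else None

print(extract_name("Hello my name is Dan"))
print(extract_name("Good morning"))
```

A named group is still numbered, so match.group(2) would return the same value here; the name just makes the intent easier to read.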
Upvotes: 2
Reputation: 3158
Use groups to retrieve parts of your regex match. Here is an improved variant of your snippet:
import re
text = "Hello my name is Dan"  # avoid naming a variable str, which shadows the built-in
r = re.search("(Hello|Hey|Hi) my name is (.+)", text)
name = r.group(2)
I added parentheses around .+ so that it can be referred to through the match object. group(0) is the complete matched string. group(1) is the first group: one of Hello, Hey or Hi. group(2) is the name.
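As a quick check of the numbering, the sketch below collects each group for one of the sample inputs. The function name show_groups is hypothetical and only there to keep the example self-contained:

```python
import re

def show_groups(text):
    """Return (group(0), group(1), group(2), groups()) or None if no match."""
    m = re.search(r"(Hello|Hey|Hi) my name is (.+)", text)
    if m is None:
        return None
    # group(0) is the whole match; groups() holds groups 1..n as a tuple
    return m.group(0), m.group(1), m.group(2), m.groups()

whole, greeting, name, all_groups = show_groups("Hey my name is Tony")
print(whole)
print(greeting)
print(name)
print(all_groups)
```

Note that groups() starts at the first capturing group, so groups()[1] and group(2) refer to the same text.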
Upvotes: 1
Reputation: 48287
You may use a (\w+) capturing group.
But if "my name is" is expected to be in your strings, why not use something like this (where r is the input string):
r.split('my name is ', 1)[1].split(' ', 1)[0]
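For what it's worth, the one-liner raises IndexError when the marker is missing. A slightly more defensive sketch of the same split-based idea, with a hypothetical helper name_after_marker, might look like this:

```python
def name_after_marker(text, marker="my name is "):
    """Return the first word after the marker, or None if it is absent."""
    parts = text.split(marker, 1)
    if len(parts) < 2:
        # marker not found in the text
        return None
    # keep only the first space-delimited word after the marker
    first_word = parts[1].split(" ", 1)[0]
    return first_word or None

print(name_after_marker("Hello my name is Dan"))
print(name_after_marker("Hey my name is Tony Stark"))
print(name_after_marker("Good morning"))
```

This ignores the greeting entirely, so it would also accept strings that start with something other than Hello, Hey or Hi.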
Upvotes: 1
Reputation: 8492
Try this:
import re
# raw string so \w reaches the regex engine unchanged; (?:...) is a non-capturing group
r = re.search(r"(?:Hello|Hey|Hi) my name is (\w+)", "Hello my name is Tony")
print(r.groups()[0])
This prints Tony.
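One thing to keep in mind is that (\w+) and (.+) behave differently when more text follows the name. A small sketch, using a made-up input sentence for illustration:

```python
import re

# Same non-capturing greeting, two different captures for the name
GREEDY = re.compile(r"(?:Hello|Hey|Hi) my name is (.+)")
WORD = re.compile(r"(?:Hello|Hey|Hi) my name is (\w+)")

text = "Hi my name is Tony, nice to meet you"  # hypothetical input
greedy_name = GREEDY.search(text).group(1)
word_name = WORD.search(text).group(1)

print(greedy_name)  # everything up to the end of the line
print(word_name)    # only the first run of word characters
```

Here (.+) captures "Tony, nice to meet you" while (\w+) stops at "Tony", so which one you want depends on whether names can contain spaces.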
Upvotes: 1