Reputation: 18187
For the following code:
import re

t1 = 'tyler vs ryan'
p1 = re.compile(r'(.*?) vs (.*?)')
print(p1.findall(t1))
the output is:
[('tyler', '')]
but I would've expected this:
[('tyler', 'ryan')]
I have found that if I add a delimiter I can get it to work:
t2 = 'tyler vs ryan!'  # Notice the exclamation mark
p2 = re.compile(r'(.*?) vs (.*?)!')  # Notice the exclamation mark
print(p2.findall(t2))
outputs:
[('tyler', 'ryan')]
Is there a way I can get my matches without having a custom delimiter?
Upvotes: 2
Views: 140
Reputation: 2562
The non-greedy ? is preventing the second group from capturing the second word. It would be better to use:
r'(.*) vs (.*)'
Upvotes: 0
Reputation: 18901
The regex captures the shortest string it can; that is what the question mark after the quantifier signifies. So as soon as it has matched the text " vs ", the second group captures an empty string and the match stops. You can step through it on regex101:
https://regex101.com/r/hO4lM7/2
If you use:
re.compile('(.*?) vs (.*)')
that is, without the second question mark, it will capture the text after vs as well.
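To make the difference concrete, here is a minimal sketch that runs both patterns against the string from the question. The variable names are only for illustration.

```python
import re

text = 'tyler vs ryan'

# Both groups lazy: nothing follows the second group in the pattern,
# so the shortest match it can make is the empty string.
lazy_both = re.findall(r'(.*?) vs (.*?)', text)

# Second group greedy: it consumes everything up to the end of the line.
greedy_second = re.findall(r'(.*?) vs (.*)', text)

print(lazy_both)      # [('tyler', '')]
print(greedy_second)  # [('tyler', 'ryan')]
```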
Upvotes: 3
Reputation: 5236
If you are assured of single-name combatants, you could use a regex like:
r'\s*(\S+)\s*vs\s*(\S+)\s*'
Your use of findall() suggests you expect to match multiple pairings. If not, you may want to use search() together with the ^ and $ anchors to bound your search more tightly.
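As a rough sketch of the search() approach, assuming each input string holds exactly one pairing of single-word names: the helper name parse_pairing and the second test string are made up for this example.

```python
import re

# Anchored variant of the pattern above; ^ and $ force the whole string to match.
PAIRING = re.compile(r'^\s*(\S+)\s*vs\s*(\S+)\s*$')

def parse_pairing(text):
    """Return (left, right) if the whole string is one pairing, else None.

    parse_pairing is a hypothetical helper, not part of the re module.
    """
    match = PAIRING.search(text)
    return match.groups() if match else None

print(parse_pairing('tyler vs ryan'))        # ('tyler', 'ryan')
print(parse_pairing('tyler vs ryan smith'))  # None, the second name is two words
```

Because \S+ cannot cross whitespace, a multi-word name is rejected rather than silently truncated, which may or may not be what you want.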
Upvotes: 2
Reputation: 852
No. Try this
import re

t1 = 'tyler vs ryan'
p1 = re.compile(r'(.*?) vs (.*?)$')
print(p1.findall(t1))
gives:
[('tyler', 'ryan')]
$ - Matches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline.
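Here is a small sketch of both behaviours of $ described above. The two-line input with 'alice vs bob' is invented for the example and is not from the question.

```python
import re

pattern = r'(.*?) vs (.*?)$'

# Default mode: $ also matches just before a single trailing newline,
# so the newline does not end up in the second group.
with_newline = re.findall(pattern, 'tyler vs ryan\n')

# MULTILINE mode: $ matches before every newline, giving one pairing per line.
per_line = re.findall(pattern, 'tyler vs ryan\nalice vs bob', re.MULTILINE)

print(with_newline)  # [('tyler', 'ryan')]
print(per_line)      # [('tyler', 'ryan'), ('alice', 'bob')]
```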
Upvotes: 2
Reputation: 114038
(.*?) is non-greedy: it will match the shortest string it can, which is the empty string (after the vs, at least).
Try (.*) or ([^ ]*) or something similar.
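The two suggestions behave differently once there is more text after the second name. A quick sketch, using a made-up longer input string:

```python
import re

text = 'tyler vs ryan and more text'  # hypothetical input, longer than the question's

# Greedy (.*) runs to the end of the line.
greedy = re.findall(r'(.*?) vs (.*)', text)

# ([^ ]*) stops at the first space, so it captures a single word.
one_word = re.findall(r'(.*?) vs ([^ ]*)', text)

print(greedy)    # [('tyler', 'ryan and more text')]
print(one_word)  # [('tyler', 'ryan')]
```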
Upvotes: 4