Reputation: 1235
I'm trying to extract the number before the - and the rest of the string after it, but my regex only captures the number and returns an empty string for the rest. Here's the output from the interactive terminal:
>>> a = '#232 - Hello There'
>>> re.findall('#(.*?) - (.*?)', a)
[('232', '')]
Why is my regex not working properly?
Upvotes: 1
Views: 1076
Reputation: 97575
Your regex is fine; you're just using the wrong function from re. The following matches correctly, because fullmatch requires the pattern to consume the entire string:
m = re.fullmatch('#(.*?) - (.*?)', a)
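As a minimal sketch of how the captured groups could be read back from that match object (the variable names number and rest are only illustrative, and the None check is an added precaution for inputs that do not fit the pattern):

```python
import re

a = '#232 - Hello There'

# fullmatch returns a match object, or None if the whole string does not fit
m = re.fullmatch(r'#(.*?) - (.*?)', a)

if m is not None:
    number, rest = m.groups()
else:
    number, rest = None, None

print(number, rest)
```

Note that re.fullmatch is only available in Python 3.4 and later.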
Upvotes: 0
Reputation: 41987
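.*? is non-greedy, i.e. it matches the shortest possible substring. Since nothing follows the last group in your pattern, the shortest possible match there is the empty string. You need the greedy version, .* (which matches the longest possible substring), for the last group: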
In [1143]: a = '#232 - Hello There'
In [1144]: re.findall('#(.*?) - (.*?)', a)
Out[1144]: [('232', '')]
In [1145]: re.findall('#(.*?) - (.*)', a)
Out[1145]: [('232', 'Hello There')]
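For comparison, here is a small sketch putting the variants side by side. The anchored and digit-only patterns are not from the session above; they are alternatives shown under the assumption that the prefix is always a run of digits:

```python
import re

a = '#232 - Hello There'

# Lazy last group: nothing after it forces it to grow, so it stays empty
lazy = re.findall(r'#(.*?) - (.*?)', a)

# Greedy last group: consumes the rest of the string
greedy = re.findall(r'#(.*?) - (.*)', a)

# Lazy last group followed by an end anchor: it must expand up to the end
anchored = re.findall(r'#(.*?) - (.*?)$', a)

# Assumption: the number is digits only, so \d+ states the intent more precisely
strict = re.findall(r'#(\d+) - (.*)', a)

print(lazy, greedy, anchored, strict)
```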
But for such simple cases you should use str methods, e.g. str.split, splitting on ' - ':
In [1146]: a.split(' - ')
Out[1146]: ['#232', 'Hello There']
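One caveat worth sketching: if the text after the number can itself contain ' - ', a plain split yields more than two parts. Passing a maxsplit of 1 keeps the remainder intact. The string b below is a made-up example, not from the question:

```python
# Hypothetical input where the separator appears twice
b = '#232 - Hello - There'

# Splits on every occurrence of the separator
parts_all = b.split(' - ')

# Splits only on the first occurrence
parts_once = b.split(' - ', 1)

print(parts_all)
print(parts_once)
```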
Or with str.partition on ' - ' and slicing:
In [1147]: a.partition(' - ')[::2]
Out[1147]: ('#232', 'Hello There')
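If you also want to drop the leading # and handle strings that do not fit the format, the partition approach could be wrapped up roughly like this. The helper name parse_title and its return-None-on-mismatch behaviour are assumptions made for illustration:

```python
def parse_title(text):
    """Split '#<number> - <rest>' into (number, rest), or return None.

    Hypothetical helper: the name and the None fallback are illustrative.
    """
    head, sep, rest = text.partition(' - ')
    # partition returns an empty separator when ' - ' is not found
    if not sep or not head.startswith('#'):
        return None
    return head[1:], rest


print(parse_title('#232 - Hello There'))
print(parse_title('#232 - Hello - There'))
print(parse_title('no separator here'))
```

Because partition splits only at the first occurrence of the separator, any later ' - ' stays in the second element.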
Upvotes: 7