Gary Fixler

Reputation: 6028

Having trouble with regex groups

>>> import re
>>> a = re.search('(\\d+h)?(\\d+m)?(\\d+s)?', 'in 1h15m')
>>> a.groups()
(None, None, None)
>>> a = re.search('.*(\\d+h)?(\\d+m)?(\\d+s)?', 'in 1h15m')
>>> a.groups()
(None, None, None)
>>> a = re.search('...(\\d+h)?(\\d+m)?(\\d+s)?', 'in 1h15m')
>>> a.groups()
('1h', '15m', None)

Why is the '...' version the only one that populates 'groups'?

Upvotes: 1

Views: 76

Answers (1)

Kobi

Reputation: 138017

Why are we getting empty groups?

First one - a?a?a? matched against "bbbaaa":

  • Start matching at the first position.
  • Try to match a, but can't find it. That's OK, it's optional, so match nothing. (x 3 times)
  • Return a successful empty match, with no successfully captured groups.

Second one - .*a?a?a? matched against "bbbaaa":

  • Match .* from the first position to the last position, because .* is greedy.
  • Now we are at the end of the string.
  • Try to match a, but can't find it. That's OK, it's optional, so match nothing. (x 3 times)
  • Return a successful match of the entire text, with no successfully captured groups.
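The two walkthroughs above can be checked against the patterns from the question by printing the span of each match next to its groups. This is only a small sketch using the standard re module; the variable names are illustrative. It also suggests why the '...' version behaves differently: the three dots consume exactly 'in ', so the optional groups start right at the first digit.

```python
import re

text = 'in 1h15m'
groups = r'(\d+h)?(\d+m)?(\d+s)?'

# Every group is optional, so the search succeeds immediately at
# position 0 with an empty match and nothing captured.
m1 = re.search(groups, text)
print(m1.span(), m1.groups())

# The greedy .* consumes the whole string first; the optional groups
# are then satisfied by matching nothing, so no backtracking happens.
m2 = re.search('.*' + groups, text)
print(m2.span(), m2.groups())

# '...' consumes exactly three characters ('in '), which leaves the
# groups positioned at '1h15m'.
m3 = re.search('...' + groups, text)
print(m3.span(), m3.groups())
```

The first match has a zero-length span, which is the giveaway that the pattern succeeded without consuming anything.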

How to solve this issue?

It is unclear what exactly you are trying to do, but you can match for:

\d+h(\d+m)?(\d+s)?|(\d+m)(\d+s)?|(\d+s)

This ensures that you match at least one element: in each alternative, at least one part is not optional, so the regex fails to match if none of the components is present. You can parse the match in a second step to get the groups, or use a branch reset group (?|...|...) if your regex flavor supports it (Python's built-in re module does not).
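As a rough sketch of the "second step" idea in Python: search with the alternation to locate the duration, then split the matched text into number/unit pairs. The helper name parse_duration and the dictionary return shape are assumptions made for illustration, not something from the question.

```python
import re

# The alternation from above: each branch requires at least one component.
DURATION = re.compile(r'\d+h(\d+m)?(\d+s)?|(\d+m)(\d+s)?|(\d+s)')

def parse_duration(text):
    """Hypothetical helper: return e.g. {'h': 1, 'm': 15}, or None if no duration is found."""
    found = DURATION.search(text)
    if found is None:
        return None
    # Second step: break the matched substring into (number, unit) pairs.
    pairs = re.findall(r'(\d+)([hms])', found.group(0))
    return {unit: int(number) for number, unit in pairs}

print(parse_duration('in 1h15m'))
print(parse_duration('in 45s'))
print(parse_duration('no time here'))
```

Parsing the whole match in a second pass avoids having to work out which of the alternation's numbered groups was actually filled.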

Upvotes: 2
