Alex

Reputation: 44325

Why does this regex not match my groups in python?

I have the following complete code example:

import re

examples = [
    "D1",       # expected: ('1')
    "D1sjdgf",  # ('1')
    "D1.2",     # ('1', '2')
    "D1.2.3",   # ('1', '2', '3')
    "D3.10.3x", # ('3', '10', '3')
    "D3.10.11"  # ('3', '10', '11')
]

for s in examples:
    result = re.search(r'^D(\d+)(?:\.(\d+)(?:\.(\d+)))', s)
    print(s, result.groups())

I want to capture the one, two, or three numbers in a string that always starts with the letter "D". I am not interested in anything after the last digit.

I would expect my regex to match e.g. D3.10.3x and return ('3', '10', '3'), but instead it returns only ('3',). I do not understand why.

^D(\d+)(?:\.(\d+)(?:\.(\d+)))

I also do not understand what a "non-capturing" group means in this context, as used in this answer.

Upvotes: 0

Views: 86

Answers (1)

anubhava

Reputation: 785146

You may use this regex with a start anchor, where the second and third capture groups sit inside nested optional non-capture groups:

^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?
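For comparison, here is a small sketch of how the pattern from the question behaves without the two trailing `?` quantifiers. The name `strict` is only chosen for this illustration. Because both dot-and-digits parts are mandatory there, shorter inputs do not match at all, `re.search` returns `None`, and calling `.groups()` on that result raises an `AttributeError`.

```python
import re

# The pattern from the question: both ".<digits>" parts are mandatory here.
strict = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+)))')

print(strict.search("D1"))               # None (no ".<digits>" follows)
print(strict.search("D1.2"))             # None (second ".<digits>" is missing)
print(strict.search("D1.2.3").groups())  # ('1', '2', '3')
```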

Explanation:

  • ^: Start
  • D: Match letter D
  • (\d+): Match 1+ digits in capture group #1
  • (?:: Start outer non-capture group
    • \.: Match a dot
    • (\d+): Match 1+ digits in capture group #2
    • (?:: Start inner non-capture group
      • \.: Match a dot
      • (\d+): Match 1+ digits in capture group #3
    • )?: End inner optional non-capture group
  • )?: End outer optional non-capture group
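As for what "non-capturing" means: a `(?:...)` group only groups part of the pattern (here, so that `?` can apply to the dot and the digits together) without adding an entry to `groups()`. A minimal sketch with a shortened two-number pattern, where the names `capturing` and `non_capturing` are just for illustration:

```python
import re

# Same idea twice: a plain (...) wrapper versus a (?:...) wrapper around ".<digits>".
capturing = re.compile(r'^D(\d+)(\.(\d+))?')
non_capturing = re.compile(r'^D(\d+)(?:\.(\d+))?')

s = "D1.2"
print(capturing.search(s).groups())      # ('1', '.2', '2')
print(non_capturing.search(s).groups())  # ('1', '2')
```

With the plain parentheses, the wrapper itself shows up as an extra group containing `.2`; with `(?:...)` only the digits are reported.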

Code Demo:

import re

examples = [
    "D1",       # expected: ('1')
    "D1sjdgf",  # ('1')
    "D1.2",     # ('1', '2')
    "D1.2.3",   # ('1', '2', '3')
    "D3.10.3x", # ('3', '10', '3')
    "D3.10.11"  # ('3', '10', '11')
]

rx = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

for s in examples:
    result = rx.search(s)
    print(s, result.groups())

Output:

D1 ('1', None, None)
D1sjdgf ('1', None, None)
D1.2 ('1', '2', None)
D1.2.3 ('1', '2', '3')
D3.10.3x ('3', '10', '3')
D3.10.11 ('3', '10', '11')
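Optional groups that did not take part in the match are reported as `None`. If you would rather get tuples of the lengths listed in the question, one possible approach is to filter those out. This is only a sketch, and `version_parts` is a hypothetical helper name:

```python
import re

rx = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

def version_parts(s):
    """Return the matched numbers as a tuple of strings, or None if s does not match."""
    m = rx.search(s)
    if m is None:
        return None
    # Drop the groups that did not participate in the match.
    return tuple(g for g in m.groups() if g is not None)

print(version_parts("D1sjdgf"))   # ('1',)
print(version_parts("D3.10.3x"))  # ('3', '10', '3')
print(version_parts("X1.2"))      # None
```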

Upvotes: 1
