Beka Tchigladze

Reputation: 95

Regex to match the number zero in the string

Can someone help me capture the zeros in a separate group?

import re

text = 'fafsah000012fafaa'

zeroRegex = re.compile(r'^(\w*)((0)*\d*)(\w*)$')

matches = zeroRegex.search(text)

print(matches.groups())

Upvotes: 2

Views: 501

Answers (1)

Wiktor Stribiżew

Reputation: 627100

You can use

zeroRegex = re.compile(r'^([a-zA-Z]*)(0*)(\d*)([a-zA-Z]*)$')
# Or, if you still need to match underscores and all Unicode letters:
zeroRegex = re.compile(r'^([^\W\d]*)(0*)(\d*)([^\W\d]*)$')

See the regex demo and a Python demo online.
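As a quick check, both the pattern from the question and the first pattern above can be run against the question's string. This is a minimal sketch; the names `original_regex` and `fixed_regex` are introduced here only for illustration.

```python
import re

text = 'fafsah000012fafaa'

# Pattern from the question: the greedy first \w* consumes the whole string.
original_regex = re.compile(r'^(\w*)((0)*\d*)(\w*)$')
original_groups = original_regex.search(text).groups()
print(original_groups)  # ('fafsah000012fafaa', '', None, '')

# ASCII-letter pattern from this answer: letters, zeros, other digits, letters.
fixed_regex = re.compile(r'^([a-zA-Z]*)(0*)(\d*)([a-zA-Z]*)$')
fixed_groups = fixed_regex.search(text).groups()
print(fixed_groups)  # ('fafsah', '0000', '12', 'fafaa')
```

The zeros land in the second group, and any remaining digits in the third.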

Note that \w* greedily matches zero or more letters, digits, or underscores, so the first group consumes the whole string. Making it non-greedy (like \w*?) would not help here, because every subpattern in this regex is optional: the first \w*? matches nothing, the following 0* and \d* then match empty strings at the very start of the string, and the last \w* (or \w*?) grabs the entire alphanumeric string.
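The non-greedy case can be verified the same way. A small sketch, assuming the question's string; `lazy_regex` is a hypothetical name for the variant with a lazy first group.

```python
import re

text = 'fafsah000012fafaa'

# Same pattern as in the question, but with a lazy first group.
lazy_regex = re.compile(r'^(\w*?)((0)*\d*)(\w*)$')
lazy_groups = lazy_regex.search(text).groups()

# The lazy group and the digit groups all match empty strings at position 0,
# and the final greedy \w* takes the entire string.
print(lazy_groups)  # ('', '', None, 'fafsah000012fafaa')
```

The zeros still do not get a group of their own; the content has only moved from the first group to the last.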

Thus, to get what you need while keeping all subpatterns optional (so that even an empty string can be parsed with this regex), you need to "subtract" the digits from the letter-matching patterns: either use [A-Za-z]* to match only ASCII letters, or [^\W\d]* to match any Unicode letters and underscores (and some other connector punctuation symbols, but that is usually not a problem).
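The difference between the two variants shows up on empty and non-ASCII input. The sketch below uses a made-up sample string, 'caf\u00e9_0042x' (that is, "café_0042x"), and the illustrative names `ascii_regex` and `unicode_regex`.

```python
import re

ascii_regex = re.compile(r'^([a-zA-Z]*)(0*)(\d*)([a-zA-Z]*)$')
unicode_regex = re.compile(r'^([^\W\d]*)(0*)(\d*)([^\W\d]*)$')

# All subpatterns are optional, so an empty string still matches.
empty_groups = unicode_regex.search('').groups()
print(empty_groups)  # ('', '', '', '')

# Hypothetical sample with a non-ASCII letter and an underscore.
sample = 'caf\u00e9_0042x'

# [^\W\d] accepts the accented letter and the underscore, but no digits.
unicode_groups = unicode_regex.search(sample).groups()
print(unicode_groups)  # ('café_', '00', '42', 'x')

# [a-zA-Z] stops at the accented letter, so the anchored pattern fails.
ascii_match = ascii_regex.search(sample)
print(ascii_match)  # None
```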

Upvotes: 2
