Christoph Wurm

Reputation: 1072

Python regex: How can I match start of string in a selection?

I want to match some digits preceded by a non-digit or at the start of the string.

Inside brackets the caret does not mean start-of-string (there it either negates the class or is a literal character), so I can't use it. I checked the reference and discovered the alternate form \A.

However, when I try to use it I get an error:

>>> s = '123'
>>> re.findall('[\D\A]\d+', s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
    return _compile(pattern, flags).findall(string)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 245, in _compile
    raise error, v # invalid expression
sre_constants.error: internal: unsupported set operator

What am I doing wrong?

Upvotes: 0

Views: 1911

Answers (2)

Andrew Clark

Reputation: 208425

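Repetition in regular expressions is greedy by default, so \d+ always consumes an entire run of digits. Each match therefore necessarily begins either at the start of the string or right after a non-digit, and re.findall() with the plain regex \d+ will get you exactly what you want: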

re.findall(r'\d+', s)
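
As a rough illustration of the greedy behaviour, here is a small sketch. The sample strings are made up for the example and are not from the question, apart from '123'. The lazy variant \d+? is included only for contrast.

```python
import re

# Hypothetical sample inputs; only '123' comes from the question.
samples = ['123', 'abc123def45', 'x7 y89']

# Greedy \d+ swallows each whole run of digits, so no match can
# start in the middle of a number.
runs = [re.findall(r'\d+', s) for s in samples]
print(runs)

# For contrast: the lazy form stops after one digit each time.
lazy = re.findall(r'\d+?', '123')
print(lazy)
```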

As a side note, you should use raw strings (r'...') when writing regular expressions, so that Python passes the backslashes through to the regex engine unchanged instead of interpreting them as string escapes first.
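
A minimal sketch of why this matters, using \b as the example since it is both a string escape (backspace) and a regex token (word boundary). The word 'cat' is just a placeholder input.

```python
import re

plain = '\bcat\b'   # Python turns each '\b' into a backspace character (0x08)
raw = r'\bcat\b'    # backslash and 'b' survive, so the regex sees a word boundary

# The non-raw pattern looks for literal backspace characters and finds nothing.
plain_hits = re.findall(plain, 'cat')
# The raw pattern matches the word as intended.
raw_hits = re.findall(raw, 'cat')

print(len(plain), len(raw))
print(plain_hits, raw_hits)
```

With \d the non-raw string happens to still work, because \d is not a string escape Python recognises, but relying on that is fragile.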

Upvotes: 0

Qtax

Reputation: 33908

You can use a negative lookbehind:

(?<!\d)\d+

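Your problem is that you are using \A (a zero-width assertion) inside a character class, and a character class can only contain things that match a single character. You could write the prefix as (?:\D|\A) instead, but that form consumes the preceding non-digit and includes it in the match, so a lookbehind is nicer.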
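
To make the difference concrete, here is a short sketch comparing the two forms. The input string is invented for the example, and the capturing-group variant at the end is one possible workaround, not something from the answer above.

```python
import re

s = 'a12 b345 67'  # hypothetical input

# Lookbehind: asserts "no digit before here" without consuming anything.
lookbehind = re.findall(r'(?<!\d)\d+', s)

# Alternation: \D consumes the preceding character, so it ends up in the match.
alternation = re.findall(r'(?:\D|\A)\d+', s)

# Assumed workaround: capture only the digits so findall returns just the group.
captured = re.findall(r'(?:\D|\A)(\d+)', s)

# Both forms handle a match at the very start of the string.
at_start = re.findall(r'(?:\D|\A)\d+', '123')

print(lookbehind)
print(alternation)
print(captured)
print(at_start)
```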

Upvotes: 2
