sobel

Reputation: 643

Possible bug in python re

I have the following code:

import re
r = re.compile(r'[*-/]')
print r.match('.') is not None

It prints True, meaning that '.' matches the given regular expression, which I did not expect. Am I missing something obvious in the regex?

I'm using CPython 2.7.3 on OS X 10.8.2.

If any one of the three characters inside the [] set is removed, it works as expected.

Upvotes: 1

Views: 249

Answers (2)

Fredrik Pihl

Reputation: 45670

Compile it with the re.DEBUG flag:

In [3]: r = re.compile(r'[*-/]', re.DEBUG)
in
  range (42, 47)

This shows that the character class was parsed as a range from code 42 to code 47. man ascii lists those codes as:

42        *
43        +
44        ,
45        -
46        .
47        /

The range includes . (code 46), so the match is correct behaviour and not a bug.
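
As a quick cross-check that does not rely on man ascii, the range can be enumerated directly. This is only a sketch: the variable names are made up for illustration, and print is written in the call form so it runs on Python 2.7 and Python 3 alike.

```python
import re

# Code points of the two endpoints of the range *-/
lo, hi = ord('*'), ord('/')   # 42 and 47

# Every character the range covers, endpoints included
covered = [chr(c) for c in range(lo, hi + 1)]
print(covered)   # ['*', '+', ',', '-', '.', '/']

# Each of those characters matches the original pattern
pattern = re.compile(r'[*-/]')
matches = [ch for ch in covered if pattern.match(ch) is not None]
print(matches == covered)   # True
```

The list contains six characters, not three, which is why '.', '+' and ',' all match.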

Upvotes: 6

Jared

Reputation: 26437

When you write the following,

r = re.compile(r'[*-/]')

the - inside the brackets defines a range: it means match any character from * through /, inclusive. If you look at the ASCII table,

*      42
+      43
,      44
-      45
.      46
/      47

That is why it matches the . character. Your current regex will also match + and ,:

>>> print r.match('+')
<_sre.SRE_Match object at 0x100483370>
>>> print r.match(',')
<_sre.SRE_Match object at 0x100483370>

To correct the regex so that it matches only *, - or /, you can escape the - like this:

r = re.compile(r'[*\-/]')

Then . no longer matches:

>>> print r.match('.') is not None
False
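
Another common way to get a literal - is to put it first or last inside the brackets, where it cannot form a range. The sketch below compares that spelling with the escaped one. The helper matched_chars and the candidate string are hypothetical and exist only for this illustration.

```python
import re

escaped = re.compile(r'[*\-/]')   # hyphen escaped, as in the fix above
trailing = re.compile(r'[*/-]')   # hyphen placed last, so it is taken literally

def matched_chars(pattern, candidates='*+,-./'):
    """Return the candidate characters that the pattern matches."""
    return [ch for ch in candidates if pattern.match(ch) is not None]

print(matched_chars(escaped))    # ['*', '-', '/']
print(matched_chars(trailing))   # ['*', '-', '/']
```

Both spellings accept only the three intended characters, so the choice between them is a matter of readability.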

Upvotes: 11
