Yannick

Reputation: 89

What's the difference between operators when combining multiple re flags?

There are three regex patterns, as below.

import re

r1=re.compile(r"""
    # com
    abc
    # tehis
    \s+\d+
    """, re.S|re.X)

r2=re.compile(r"""
    # com
    abc
    # tehis
    \s+\d+
    """, re.S+re.X)  

r3=re.compile(r"""
    # com
    abc
    # tehis
    \s+\d+
    """, re.S&re.X)

The string to be matched is as follows.

>>> s
'abc\n    899'

The findall results are shown below.

>>> s
'abc\n    899'
>>> r1.findall(s)
['abc\n    899']
>>> r2.findall(s)
['abc\n    899']
>>> r3.findall(s)
[]

We see that r3 failed to match, while r1 and r2 succeeded. So what's the difference between these operators when combining multiple re flags?

Upvotes: 1

Views: 27

Answers (1)

Tim Biegeleisen

Reputation: 521083

The re flags, such as re.I, look like plain integer values, but they are interpreted as bit masks in which each flag occupies a single bit. So, here is what your three expressions actually evaluate to:

re.S | re.X = 80
re.S + re.X = 80
re.S & re.X = 0

Here is what the value 80 is in binary:

1010000

And here are the values for re.S (16) and re.X (64) in binary:

re.S = 0010000
re.X = 1000000
       1010000 <--- 80 in decimal
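
The values can be checked directly in the interpreter. This is a minimal sketch, assuming CPython's standard re module; the numeric values are an implementation detail, and the variable names are only for illustration.

```python
import re

# Each flag is a power of two, so it occupies exactly one bit.
s_value = int(re.S)          # DOTALL
x_value = int(re.X)          # VERBOSE
combined = int(re.S | re.X)  # bitwise OR sets both bits

for label, value in (("re.S", s_value), ("re.X", x_value), ("re.S | re.X", combined)):
    # Print the decimal value next to its 7-digit binary form.
    print(f"{label:12} {value:3} {value:07b}")
```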

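It should be clear from the above that re.S | re.X and re.S + re.X both set the bits of both flags, while re.S & re.X sets none: the two flags share no bit, so their bitwise AND is 0, which is the same as passing no flags at all. Without re.X, the whitespace and # comments in your pattern are treated as literal text to match, so r3 finds nothing. Integer addition + and bitwise OR | give the same result here only because the two flags have no bits in common; | is the right operator for combining flags.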
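
To see how the three operators behave, here is a small sketch, again assuming CPython's re module. The pattern is a shortened version of the one in the question, and names such as both, none_set and doubled are hypothetical. It also shows why + is risky: adding the same flag twice carries into a different bit, whereas | does not.

```python
import re

pattern = r"""
    abc      # literal prefix
    \s+\d+   # whitespace, then digits
    """
s = "abc\n    899"

both = re.S | re.X      # both bits set
none_set = re.S & re.X  # no shared bit, so no flags at all

# With re.X the layout and comments in the pattern are ignored.
matched = re.compile(pattern, both).findall(s)
# Without re.X the newlines, spaces and '#' text must match literally.
unmatched = re.compile(pattern, none_set).findall(s)

# '+' only agrees with '|' when the operands share no bits.
doubled = re.X + re.X        # plain integer addition
idempotent = re.X | re.X     # still just re.X

print(matched, unmatched, doubled, int(idempotent))
```

In this sketch doubled no longer means re.X at all, which is why | is the safer habit even though + happens to work in the question.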

Upvotes: 1
