Reputation: 347
The following code does what I want to do:
one_sentence = lambda x: re.search(r'b|c|d', x)
As does the following:
if re.search(r'P' + chr(8868), 'aP' + chr(8868)):
    print(True)
But I cannot get the following to work:
if re.search(chr(8835)|chr(8868)|chr(8869), 'P' + chr(8868)):
    print(True)
I'm trying to make it so that if any of chr(8835), chr(8868), or chr(8869) is in a string, the code prints True.
Upvotes: 1
Views: 71
Reputation: 19544
For the pipe character | to act as alternation in a regular expression, it needs to be part of the pattern string (as in your first example, re.search(r'b|c|d', x)). Here, however, you are using it as a Python operator:
>>> re.search(chr(8835)|chr(8868)|chr(8869), 'P' + chr(8868))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'str' and 'str'
That is why you get an error: as a Python operator, | is a "bitwise or" and can't be applied to two strings. Instead, build a pattern string that contains the pipes:
>>> re.search(chr(8835) + '|' + chr(8868) + '|' + chr(8869), 'P' + chr(8868))
<_sre.SRE_Match object; span=(1, 2), match='⊤'>
Or, if you prefer, you can write the hex values of the Unicode characters straight into the string using the \uXXXX syntax and include the pipes directly:
>>> hex(8835)
'0x2283'
>>> hex(8868)
'0x22a4'
>>> hex(8869)
'0x22a5'
>>>
>>> '\u2283|\u22a4|\u22a5'
'⊃|⊤|⊥'
>>> re.search('\u2283|\u22a4|\u22a5', 'P\u22a4')
<_sre.SRE_Match object; span=(1, 2), match='⊤'>
Upvotes: 1