Match the € symbol in re with python

Question

I'm trying to match the € symbol in a string, but I get strange behaviour when using the special character "?". It works as expected with normal characters:

import re

print re.match(r'a?1', 'a1')
<_sre.SRE_Match object at 0x3a2ba58>

print re.match(r'a?1', '1')
<_sre.SRE_Match object at 0x3a2ba58>

But with the € symbol I get this output:

print re.match(r'€?1', '€1')
<_sre.SRE_Match object at 0x3a2ba58>

print re.match(r'€?1', '1')
None

Any idea what's going on? I suspect it's something related to Unicode. I'm using Python 2.7. Thank you.

dwitvliet · Accepted Answer

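€ is not an ASCII character. In a Python 2 byte string it is stored as several bytes (three in UTF-8), so the ? applies only to the last of those bytes rather than to the whole symbol. Use unicode strings for both the pattern and the subject so that € is treated as a single character (the unicode literals are what fix the match; re.UNICODE itself only affects classes such as \w and \b):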

print re.match(ur'€?1', u'€1', flags=re.UNICODE)
<_sre.SRE_Match object at 0x7ffde0084bf8>

print re.match(ur'€?1', u'1', flags=re.UNICODE)
<_sre.SRE_Match object at 0x7ffde0084bf8>
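
To see why the byte-string version fails, here is a small sketch that spells out the UTF-8 bytes explicitly. It assumes the original strings were UTF-8 encoded (the question does not say which encoding the terminal or source file used), and the variable names are only illustrative. It is written so that it should run on both Python 2.7 and Python 3.

```python
import re

# Assumption: the byte strings in the question were UTF-8 encoded.
euro_bytes = u'\u20ac'.encode('utf-8')  # the three bytes \xe2 \x82 \xac

# Byte pattern: "?" binds only to the final byte \xac, so \xe2\x82 is mandatory.
byte_pattern = re.compile(euro_bytes + b'?1')
print(byte_pattern.match(euro_bytes + b'1') is not None)  # True
print(byte_pattern.match(b'1') is not None)               # False

# Grouping the bytes makes the whole symbol optional, even in a byte pattern.
grouped_pattern = re.compile(b'(?:' + euro_bytes + b')?1')
print(grouped_pattern.match(b'1') is not None)            # True

# Text (unicode) pattern: the euro sign is one character, so "?" covers it.
text_pattern = re.compile(u'\u20ac?1')
print(text_pattern.match(u'\u20ac1') is not None)         # True
print(text_pattern.match(u'1') is not None)               # True
```

The non-capturing group is a workaround if you have to stay with byte strings; using unicode strings throughout is usually the simpler option.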
