Reputation: 2443
In my brain, the following:
>>> re.sub('([eo])', '_\1_', 'aeiou')
should return:
'a_e_i_o_u'
instead it returns:
'a_\x01_i_\x01_u'
I'm sure I'm having a brain cramp, but I can't for the life of me figure out what's wrong.
Upvotes: 1
Views: 416
Reputation: 180391
Use a raw string (the r prefix) for the replacement:
re.sub('([eo])', r'_\1_', 'aeiou')
Output:
In [3]: re.sub('([eo])', r'_\1_', 'aeiou')
Out[3]: 'a_e_i_o_u'
In [4]: "\1"
Out[4]: '\x01'
In [5]: r"\1"
Out[5]: '\\1'
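To see why the two literals behave differently, it may help to inspect them directly. This is only a minimal sketch; the variable names `plain`, `raw`, and `result` are made up for illustration:

```python
import re

plain = '_\1_'   # '\1' is an octal escape: one character with code point 1
raw = r'_\1_'    # raw literal: the backslash and the digit stay as two characters

print(len(plain), len(raw))   # 3 4
print(ord(plain[1]))          # 1

# Only the raw version hands an actual backslash-1 group reference to re.sub.
result = re.sub('([eo])', raw, 'aeiou')
print(result)                 # a_e_i_o_u
```

In other words, with the plain literal the regex engine never sees a backreference at all, just a control character to insert verbatim.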
Upvotes: 2
Reputation: 1121544
'\1' produces '\x01' in Python string literals, because a backslash followed by digits is read as an octal escape. Double the backslash, or use a raw string literal:
>>> import re
>>> re.sub('([eo])', '_\1_', 'aeiou')
'a_\x01_i_\x01_u'
>>> re.sub('([eo])', '_\\1_', 'aeiou')
'a_e_i_o_u'
>>> re.sub('([eo])', r'_\1_', 'aeiou')
'a_e_i_o_u'
See The Backslash Plague in the Python regex HOWTO:
As stated earlier, regular expressions use the backslash character ('\') to indicate special forms or to allow special characters to be used without invoking their special meaning. This conflicts with Python’s usage of the same character for the same purpose in string literals.
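The same conflict shows up on the pattern side, not only in replacement strings. The sketch below is an illustration under my own assumptions (the sample text and variable names are invented here): it uses `\b`, which is a backspace in a normal literal but a word boundary once it reaches the regex engine, and also shows `\g<1>`, an alternative way to spell the group reference in a replacement.

```python
import re

text = 'cat concat cat'

# Normal literal: '\b' becomes a backspace character (code point 8),
# so the pattern looks for backspaces that are not in the text.
no_raw = re.findall('\bcat\b', text)
print(no_raw)        # []

# Raw literal: the regex engine receives backslash + 'b', a word boundary.
with_raw = re.findall(r'\bcat\b', text)
print(with_raw)      # ['cat', 'cat']

# \g<1> refers to group 1 and avoids any ambiguity with following digits.
named_ref = re.sub(r'([eo])', r'_\g<1>_', 'aeiou')
print(named_ref)     # a_e_i_o_u
```

Writing every pattern and replacement as a raw string by default is a common habit that sidesteps this class of problem.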
Upvotes: 4