Alcott

Reputation: 18585

[python]: Problem with backslashes in Python string literals

code goes below:

import re

line = r'abc\def\n'
rline = re.sub('\\\\', '+', line)  # rline is now 'abc+def+n'

I just want to replace the backslashes in line with '+'. I thought a single backslash could be written as '\\' in a Python string literal, so why do I need '\\\\' to make re.sub work correctly?

I'm confused.

Upvotes: 3

Views: 934

Answers (3)

Owen

Reputation: 39356

Because there are two levels of backslashing:

  1. re.sub uses \ as an escape
  2. Python uses \ as an escape (unless you do r'...')

So '\\\\' (Python source) -> \\ (the string re.sub receives) -> \ (the character the regex matches)
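The two levels above can be checked one at a time. This is only a minimal sketch that reuses the line value from the question; the variable name pattern is introduced here for illustration.

```python
import re

# Python level: four backslashes in the source produce a two-character string.
pattern = '\\\\'
assert len(pattern) == 2
assert pattern == r'\\'   # the raw-string spelling of the same two characters

# Regex level: the two-character pattern \\ matches one literal backslash.
line = r'abc\def\n'       # raw string, so it holds two real backslashes
rline = re.sub(pattern, '+', line)
print(rline)              # abc+def+n
```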

EDIT

And the SO level of backslashing! (it got me!)

Upvotes: 4

millimoose

Reputation: 39950

If you want to search for a literal pattern rather than an actual regular expression, use raw strings together with re.escape() so that you don't have to escape regex metacharacters by hand.

So, your example would become:

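import re

line = r'abc\def\n'
# r'\' is a syntax error (a raw string cannot end in a single backslash),
# so the lone backslash still has to be written as '\\'.
backslash = re.escape('\\')
rline = re.sub(backslash, '+', line)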
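As a rough illustration of why re.escape() helps with longer literals, here is a sketch with a made-up search string that contains both a dot and a backslash. The names needle and text are hypothetical and not from the question.

```python
import re

# Hypothetical literal to search for: the five characters a . b \ c
needle = 'a.b\\c'
escaped = re.escape(needle)   # puts a backslash in front of the dot and the backslash

text = 'a.b\\c axb\\c'
# Only the exact literal text is replaced; 'axb\c' is left alone
# because the escaped dot no longer means "any character".
result = re.sub(escaped, '+', text)

# When no regex features are needed, str.replace avoids escaping entirely.
plain = text.replace(needle, '+')
```

Both result and plain end up as the same string here, so for a purely literal replacement str.replace is probably the simpler choice.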

Upvotes: 2

unutbu

Reputation: 879291

It's a good habit to always use raw strings when dealing with regex patterns:

In [45]: re.sub(r'\\', r'+', line)
Out[45]: 'abc+def+n'

To answer your question though, Python interprets '\\\\' as two backslash characters:

In [44]: list('\\\\')
Out[44]: ['\\', '\\']

And the rules of regex interpret two backslash characters as one literal backslash.
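The flip side is that a pattern holding only one backslash is not a valid regex, which is why '\\' alone does not work as the pattern. A small sketch of that, assuming the standard re module:

```python
import re

one_backslash = '\\'            # Python level: a single backslash character
assert len(one_backslash) == 1

try:
    # Regex level: a backslash starts an escape, but nothing follows it.
    re.compile(one_backslash)
    compiled = True
except re.error:
    compiled = False

print(compiled)                 # False
```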

Upvotes: 7
