r0xette

Reputation: 908

Want to replace backslash in python3 string

I have a string

"abc INC\","None", "0", "test"

From this string I want to replace any occurrence of backslash when it appears before " with a pipe |. I wrote the following code but it actually takes out " and leaves the \ behind.

import re
str = "\"abc INC\\\",\"None\", \"0\", \"test\""
str = re.sub("(\\\")", "|", str)
print(str)

Output: |abc INC\|,|None|, |0|, |test|
Desired Output: "abc INC|","None", "0", "test"

Can someone point out what I am doing wrong?

Upvotes: 3

Views: 4439

Answers (4)

syntonym

Reputation: 7384

For literal backslashes in Python regexes you need to escape twice, giving you the pattern '\\\\"' or "\\\\\"". The first escaping is needed for Python to actually put a backslash into the string. But regex patterns themselves use backslashes as a special character (for things like \w word characters, etc.), so the backslash has to be escaped a second time for the regex engine. The documentation states:

The special sequences consist of '\' and a character from the list below. If the ordinary character is not on the list, then the resulting RE will match the second character.

So the pattern \" will match a single " because " is not a character with a special meaning there.

You can use the raw notation to only escape once: r'\\"'.
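
As a minimal sketch of the point above (the variable names s, patterns and result are only illustrative), the three spellings of the pattern are the same three-character string, and substituting with '|"' puts the quote back after the pipe:

```python
import re

# The string from the question: "abc INC\","None", "0", "test"
s = "\"abc INC\\\",\"None\", \"0\", \"test\""

# Three ways to write the same regex: backslash, backslash, double quote.
patterns = ['\\\\"', "\\\\\"", r'\\"']
assert len(set(patterns)) == 1

# Match a literal backslash followed by a quote, replace it with |"
result = re.sub(r'\\"', '|"', s)
print(result)  # "abc INC|","None", "0", "test"
```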

Upvotes: 0

Carlos Afonso

Reputation: 1957

This should solve your problem:

import re
s = "\"abc INC\\\",\"None\", \"0\", \"test\""
s = re.sub(r"\\", "|", s)

Also, don't use str as a variable name: it is not a reserved keyword, but it is the name of the built-in string type, and assigning to it shadows that type.
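
One caveat, sketched with a made-up sample string (sample is hypothetical, not from the question): the pattern r"\\" replaces every backslash, not only those in front of a quote. That gives the desired output for the string in the question, which has a single backslash, but it may not be what you want on other input:

```python
import re

# Hypothetical input with two backslashes: C:\temp\"x"
sample = 'C:\\temp\\"x"'

# Replaces every backslash.
all_replaced = re.sub(r"\\", "|", sample)
print(all_replaced)  # C:|temp|"x"

# Replaces only a backslash that is directly followed by a quote.
only_before_quote = re.sub(r'\\(?=")', "|", sample)
print(only_before_quote)  # C:\temp|"x"
```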

Upvotes: 0

holdenweb

Reputation: 37003

See Jamie Zawinski's famous quote about regular expressions. Try to resort to regular expressions only when absolutely necessary. In this case, it isn't.

The actual content of string str (bad name for a variable, by the way, since there's a built-in type of that name) is

"abc INC\","None", "0", "test"

Why not just

str.replace('\\"', '|"')

which will do exactly what you want.
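
A short sketch of that approach, using s instead of str as the variable name (the names s and fixed are illustrative). Strings are immutable, so replace returns a new string that has to be assigned:

```python
# The string from the question: "abc INC\","None", "0", "test"
s = "\"abc INC\\\",\"None\", \"0\", \"test\""

# '\\"' is a backslash followed by a quote; swap it for a pipe and a quote.
fixed = s.replace('\\"', '|"')
print(fixed)  # "abc INC|","None", "0", "test"
```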

Upvotes: 3

Moses Koledoye

Reputation: 78546

You can use the following positive lookahead assertion, r'\\(?=")', which matches a backslash only when it is followed by a " and leaves the quote itself in place:

import re

my_str = "\"abc INC\\\",\"None\", \"0\", \"test\""
p = re.sub(r'\\(?=")', '|', my_str)
print(p)
# '"abc INC|","None", "0", "test"'

Try not to use builtin names as names for variables, viz. str, to avoid shadowing the builtin.

Upvotes: 0
