Reputation: 908
I have a string
"abc INC\","None", "0", "test"
In this string, I want to replace every backslash that appears directly before a " with a pipe |. I wrote the following code, but it replaces the " characters instead and leaves the \ behind.
import re
str = "\"abc INC\\\",\"None\", \"0\", \"test\""
str = re.sub("(\\\")", "|", str)
print(str)
Output: |abc INC\|,|None|, |0|, |test|
Desired Output: "abc INC|","None", "0", "test"
Can someone point out what I am doing wrong?
Upvotes: 3
Views: 4439
Reputation: 7384
For literal backslashes in Python regexes you need to escape twice, giving you the pattern '\\\\"' or "\\\\\"". The first level of escaping is needed for Python to actually put a backslash into the string. But regex patterns themselves use backslashes as a special character (for things like \w for word characters, etc.). The documentation states:
The special sequences consist of '\' and a character from the list below. If the ordinary character is not on the list, then the resulting RE will match the second character.
So the pattern \" will match a single " because " is not a character with a special meaning there.
You can use the raw notation to only escape once: r'\\"'.
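As a quick check of the escaping rules described in this answer, here is a minimal sketch. The variable names `text`, `original_attempt`, and `result` are illustrative only, and the sample text is written as a raw string for readability.

```python
import re

# Raw string literal holding the text:  "abc INC\","None", "0", "test"
text = r'"abc INC\","None", "0", "test"'

# All three spellings produce the same pattern text: two backslashes and a quote.
# The regex engine reads that as "a literal backslash followed by a quote".
assert '\\\\"' == "\\\\\"" == r'\\"'

# The pattern from the question reaches the regex engine as (\"), i.e. just a quote.
original_attempt = re.sub("(\\\")", "|", text)
print(original_attempt)  # |abc INC\|,|None|, |0|, |test|

# With the backslash escaped for the regex engine too, only \" is matched.
result = re.sub(r'\\"', '|"', text)
print(result)  # "abc INC|","None", "0", "test"
```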
Upvotes: 0
Reputation: 1957
This should solve your problem for the sample input:
import re
s = "\"abc INC\\\",\"None\", \"0\", \"test\""
s = re.sub(r"\\", "|", s)
Also, don't use str as a variable name. It is not a reserved keyword, but it is the name of a built-in type, and reusing it shadows the built-in.
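One caveat worth checking: the pattern r"\\" matches every backslash, not only those before a quote. The sketch below uses a hypothetical sample (not from the question) that contains a backslash in both positions to show the difference.

```python
import re

# Hypothetical sample: one backslash in the middle, one directly before a quote.
sample = r'"C:\temp\","None"'

# Replaces every backslash in the string.
every_backslash = re.sub(r"\\", "|", sample)
print(every_backslash)  # "C:|temp|","None"

# Replaces a backslash only when a quote follows it.
only_before_quote = re.sub(r'\\"', '|"', sample)
print(only_before_quote)  # "C:\temp|","None"
```

For the input in the question the two give the same result, because its only backslash sits before a quote.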
Upvotes: 0
Reputation: 37003
See Jamie Zawinski's famous quote about regular expressions. Try to resort to regular expressions only when absolutely necessary. In this case, it isn't.
The actual content of the string str (a bad name for a variable, by the way, since there's a built-in type of that name) is
"abc INC\","None", "0", "test"
Why not just
str.replace('\\"', '|"')
which returns exactly the string you want (assign the result to a name, since strings are immutable).
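A minimal runnable sketch of this approach, assuming the text is held in a variable called `text` (an illustrative name chosen to avoid shadowing the built-in):

```python
# Raw string literal holding the text:  "abc INC\","None", "0", "test"
text = r'"abc INC\","None", "0", "test"'

# '\\"' is a two-character string: one backslash and one quote.
# str.replace returns a new string and leaves the original unchanged.
fixed = text.replace('\\"', '|"')

print(fixed)  # "abc INC|","None", "0", "test"
```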
Upvotes: 3
Reputation: 78546
You can use a positive lookahead assertion, r'\\(?=")', which matches a backslash only when a " follows it and leaves the " itself untouched:
import re
my_str = "\"abc INC\\\",\"None\", \"0\", \"test\""
p = re.sub(r'\\(?=")', '|', my_str)
print(p)
# "abc INC|","None", "0", "test"
Try not to use builtin names, such as str, as variable names, to avoid shadowing the builtin.
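If lookaheads feel unfamiliar, a capture group with a backreference should give the same result. This is an alternative sketch, not the approach used in this answer, and the names `with_lookahead` and `with_group` are illustrative.

```python
import re

my_str = r'"abc INC\","None", "0", "test"'

# Lookahead: the quote is checked but not consumed, so only the backslash is replaced.
with_lookahead = re.sub(r'\\(?=")', '|', my_str)

# Capture group: the quote is consumed, then written back via the \1 backreference.
with_group = re.sub(r'\\(")', r'|\1', my_str)

print(with_lookahead == with_group)  # True
```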
Upvotes: 0