user740316

Replacing a string by a string with backslashes

I am writing a program that automatically generates my reports in LaTeX, so I have to escape LaTeX's special characters. Basically, whenever I read $, _, %, etc., I have to replace it with \$, \_, \%, and so on.

I naively tried to do mystring.replace('$','\$'), yet it adds a double backslash, as shown below:

my_text_to_parse = "$x^2+2\cdot x + 2 = 0$"
my_text_to_parse.replace('$','\$')
#=> '\\$x^2+2\\cdot x + 2 = 0\\$'

Is there any way to avoid doubling escape characters?

Upvotes: 1

Views: 262

Answers (2)

Roger Fan

Reputation: 5045

You're seeing the double backslash because you're looking at the representation (repr) of the string, not its printed output. The representation shows an extra backslash because \ is the escape character and must itself be escaped; otherwise it could be confused with the start of a special sequence such as \t or \n. When the string is actually printed or written to a file, each doubled backslash comes out as a single backslash.

For example, compare

print('\')
# SyntaxError: EOL while scanning string literal

to

print('\\')
# \

In the first string, the backslash escapes the closing quotation mark, so the string literal is never terminated. This is why you generally can't put a bare backslash in a string literal. In the second string, the first backslash escapes the second, and the pair is interpreted as a single backslash.

print(repr('\\'))
# '\\'

But the representation of the second string still shows both backslashes. Other special characters such as \n behave the same way, and there the issue is a bit easier to see. Just as \n is the escape sequence that means line break, \\ is the escape sequence that means a single backslash.

print('hi\nmom')
# hi
# mom

print(repr('hi\nmom'))
# 'hi\nmom'
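A quick way to confirm that the doubled backslash exists only in the representation is to check the length of the string. This is a minimal sketch; the variable names are illustrative and not taken from the question.

```python
# The literal '\\' denotes ONE character; repr() just displays it escaped.
single = '\\'
print(len(single))        # 1
print(single)             # \
print(repr(single))       # '\\'

# The same holds after a replacement like the one in the question.
escaped = '$x$'.replace('$', '\\$')
print(len(escaped))       # 5: \ $ x \ $
print(escaped)            # \$x\$
print(repr(escaped))      # '\\$x\\$'
```

If the replacement really inserted two backslashes per dollar sign, the second length would be 7 rather than 5.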

To actually answer your question: the way you're doing it works, but you probably don't want to write it quite that way, because '\$' hides the escaping issue. It looks like a special character \$ in the same way that \n is a special character. Because no such escape sequence is defined, Python leaves the backslash in the string unchanged, so '\$' ends up being the same two-character string as '\\$'. You generally don't want to rely on that behavior, though: recent Python versions emit a warning for unrecognized escape sequences.
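The handling of an unrecognized escape such as \$ can be checked directly. The sketch below compiles the literal from a source string, so the example file itself stays free of warnings, and records whatever warning the compiler emits. As far as I know the category is DeprecationWarning from Python 3.6 and SyntaxWarning from 3.12; the names `source`, `namespace`, and `value` are just illustrative.

```python
import warnings

# Source text of a literal with an unrecognized escape, held in a raw string
# so that this example itself compiles cleanly.
source = r"value = '\$'"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # record every warning, whatever its category
    namespace = {}
    exec(compile(source, "<demo>", "exec"), namespace)

value = namespace["value"]
print(len(value))          # 2: the backslash is kept, followed by $
print(value == "\\$")      # True

# The category printed here depends on the Python version in use.
for w in caught:
    print(w.category.__name__, w.message)
```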

A better way is to escape the backslash explicitly with another one, or to use a raw string, in which backslashes are taken literally instead of starting an escape sequence. All of these give the same result.

s = '$x^2+2\\cdot x + 2 = 0$'

print(s.replace('$', '\$'))   # Technically works, but not as clear
# \$x^2+2\cdot x + 2 = 0\$

print(s.replace('$', '\\$'))  # Escaping the backslash
# \$x^2+2\cdot x + 2 = 0\$

print(s.replace('$', r'\$'))  # Using a raw string
# \$x^2+2\cdot x + 2 = 0\$
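Since the question mentions several characters ($, _, %, etc.), here is one possible way to escape them all in a single pass with str.translate. Treat it as a sketch under assumptions: the mapping `LATEX_SPECIAL` and the function name `escape_latex` are hypothetical, and the exact set of characters and replacements depends on your documents.

```python
# Hypothetical mapping; extend or trim it to match the characters in your reports.
LATEX_SPECIAL = {
    '\\': r'\textbackslash{}',
    '$': r'\$',
    '%': r'\%',
    '_': r'\_',
    '&': r'\&',
    '#': r'\#',
    '{': r'\{',
    '}': r'\}',
    '~': r'\textasciitilde{}',
    '^': r'\textasciicircum{}',
}
_TABLE = str.maketrans(LATEX_SPECIAL)


def escape_latex(text):
    """Return text with LaTeX special characters escaped in a single pass."""
    return text.translate(_TABLE)


print(escape_latex('50% of $x_1'))
# 50\% of \$x\_1
```

Because translate looks at each input character exactly once, the braces in \textbackslash{} are not escaped again, which can happen with a chain of replace calls. Note that this is meant for plain text; applying it to text that already contains LaTeX markup would escape that markup too.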

Upvotes: 2

vks

Reputation: 67968

import re
print(re.sub(r"\$", r"\\$", x))  # x is the input string; \\ in the replacement yields one backslash

You can try re.sub. It will give the expected result.
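The same idea extends to several characters at once with a character class. The sketch below uses a function as the replacement, which avoids having to think about the escape rules of re.sub replacement templates; the names `SPECIAL` and `escape_simple` and the chosen character set are illustrative assumptions.

```python
import re

# Characters that only need a backslash in front; an illustrative subset.
SPECIAL = re.compile(r'[$%_&#]')


def escape_simple(text):
    # A function replacement returns the exact string to insert for each match.
    return SPECIAL.sub(lambda m: '\\' + m.group(0), text)


print(escape_simple('5% of $x_1'))
# 5\% of \$x\_1
```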

Upvotes: 0
