Reputation: 149
I am trying to replace whitespace inside LaTeX that is embedded in a markdown document with \\; using regex.
In the md package I'm using, all LaTeX is wrapped in either $ or $$.
I would like to change the following from
"dont edit this $result= \frac{1}{4}$ dont edit this $$some result=123$$"
to this
"dont edit this $result=\\;\frac{1}{4}$ dont edit this $$some\\;result=123$$"
I have managed to do it using the messy function below, but would like to use regex for a cleaner approach. Any help would be appreciated.
import re

vals = r"dont edit this $result= \frac{1}{4}$ dont edit this $$some result=123$$"

def cleanlatex(vals):
    vals = vals.replace(" ", " ")
    char1 = r"\$\$"
    char2 = r"\$"
    # positions of every $$ delimiter, then of every lone $ (with $$ masked out)
    indices = [i.start() for i in re.finditer(char1, vals)]
    indices += [i.start() for i in re.finditer(char2, vals.replace("$$", "~~"))]
    indices.sort()
    print(indices)
    # check that the number of $ and $$ delimiters is even
    if len(indices) % 2 == 0:
        while indices:
            # take pairs from the end so earlier indices stay valid as the string grows
            finish = indices.pop()
            start = indices.pop()
            vals = vals[:start] + vals[start:finish].replace(' ', r'\;') + vals[finish:]
    vals = vals.replace(" ", " ")
    return vals

print(cleanlatex(vals))
Output:
[18, 39, 60, 78]
dont edit this $result=\\;\frac{1}{4}$ dont edit this $$some\\;result=123$$
Upvotes: 0
Views: 586
Reputation: 149
I never thought of a lambda! Thank you @trincot, your answer covers things I didn't even know were possible with regex. I am trying to decipher the pattern and would love some clarification if you can. I'd really appreciate it, as I've had a look at the re docs but am still confused by the following
Thanks again for the reply
Upvotes: 0
Reputation: 350167
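With regex I would still do it in two steps: match each part delimited by dollars, then replace the spaces inside each match using a callback function:

import re

def cleanlatex(vals):
    return re.sub(r"(\$\$?)(.*?)\1", lambda m: m[0].replace(" ", r"\;"), vals)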
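Since the pattern is dense, here is the same regex written out with re.VERBOSE so each piece carries its own comment. This is only a sketch for reading the pattern; the names LATEX_SPAN and cleanlatex_verbose are made up for the illustration, and the replacement inserts a single backslash followed by a semicolon, as r"\;" does in the one-liner.

```python
import re

# Same pattern as the one-liner, spelled out piece by piece.
LATEX_SPAN = re.compile(
    r"""
    (\$\$?)   # group 1: opening delimiter, one dollar optionally followed by a second
    (.*?)     # group 2: the content, as few characters as possible (lazy), no newlines
    \1        # backreference: exactly the delimiter that group 1 captured
    """,
    re.VERBOSE,
)

def cleanlatex_verbose(vals):
    # m[0] is the whole match, delimiters included; only its spaces are rewritten.
    return LATEX_SPAN.sub(lambda m: m[0].replace(" ", r"\;"), vals)

text = r"dont edit this $result= \frac{1}{4}$ dont edit this $$some result=123$$"
print(cleanlatex_verbose(text))
# dont edit this $result=\;\frac{1}{4}$ dont edit this $$some\;result=123$$
```

The lambda is called once per match, so text outside the dollars never reaches replace.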
If the dollars don't match up, this will still make replacements until no more pairs of matching dollars are found. This differs from your code, where nothing is replaced when the dollars don't match.
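To make that difference concrete, here is a small sketch with an unbalanced input. The cleanlatex_strict variant is hypothetical: it assumes that "the dollars match" means an even number of delimiters (counting $$ as one), which mirrors the parity check in the question rather than doing any real validation.

```python
import re

def cleanlatex(vals):
    return re.sub(r"(\$\$?)(.*?)\1", lambda m: m[0].replace(" ", r"\;"), vals)

def cleanlatex_strict(vals):
    # Hypothetical variant: count the delimiters first ($$ counts as one)
    # and leave the text untouched when the count is odd.
    if len(re.findall(r"\$\$?", vals)) % 2 != 0:
        return vals
    return cleanlatex(vals)

unbalanced = "a $b c$ d $e f"
print(cleanlatex(unbalanced))         # a $b\;c$ d $e f
print(cleanlatex_strict(unbalanced))  # a $b c$ d $e f
```

The first call rewrites the one complete pair and leaves the dangling $e f alone; the second refuses to touch anything.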
When dollars are "nested", like in "$$nested $ here$$", then the inner dollar will not be regarded as a delimiter in this solution. Or if a double dollar happens to follow a single dollar, the double one will be interpreted as two single dollars that just happen to follow each other. So "$part one$$part two$" will identify two parts, each delimited with a single dollar.
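Both cases can be checked quickly by listing what the pattern actually matches. This is just a sketch using the two example strings from the paragraph above; spans is a helper name invented for the demonstration.

```python
import re

PATTERN = r"(\$\$?)(.*?)\1"

def cleanlatex(vals):
    return re.sub(PATTERN, lambda m: m[0].replace(" ", r"\;"), vals)

def spans(vals):
    # List the delimited parts the pattern identifies, delimiters included.
    return [m[0] for m in re.finditer(PATTERN, vals)]

print(spans("$$nested $ here$$"))          # ['$$nested $ here$$']
print(cleanlatex("$$nested $ here$$"))     # $$nested\;$\;here$$

print(spans("$part one$$part two$"))       # ['$part one$', '$part two$']
print(cleanlatex("$part one$$part two$"))  # $part\;one$$part\;two$
```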
Your question didn't give any such boundary conditions (there are quite a few of them), so the solution may need some adaptations.
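One example of such an adaptation, in case a $$ block can span several lines: by default the dot does not match a newline, so a multi-line block is not picked up as one part. Passing re.DOTALL is one way to handle it. This is a sketch under that assumption, and cleanlatex_multiline is a name made up for it; whether your documents need it is for you to judge.

```python
import re

PATTERN = r"(\$\$?)(.*?)\1"

def cleanlatex(vals):
    return re.sub(PATTERN, lambda m: m[0].replace(" ", r"\;"), vals)

def cleanlatex_multiline(vals):
    # re.DOTALL lets "." match newlines too, so a block may span lines.
    return re.sub(PATTERN, lambda m: m[0].replace(" ", r"\;"), vals, flags=re.DOTALL)

block = "$$a b\nc d$$"
print(repr(cleanlatex(block)))            # '$$a b\nc d$$'
print(repr(cleanlatex_multiline(block)))  # '$$a\\;b\nc\\;d$$'
```

Only spaces are replaced; the newline inside the block is left as it is.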
Upvotes: 2