Reputation: 243
I have a text file that contains strings like
<disp-formula id="deqn*"><text-notation="math">\begin{equation*}
x=5 \tag{5}
y=3 \tag{6}
x+y=8 \tag {7}
\end{equation*}</text-notation="math"></disp-formula>
<disp-formula id="deqn*"><text-notation="math">\begin{equation*}
x+y=5 \tag{3}
\end{equation*}</text-notation="math"></disp-formula>
<disp-formula id="deqn*"><text-notation="math">\begin{equation*}
a+y=15 \tag {4a}
\end{equation*}</text-notation="math"></disp-formula>
<disp-formula id="deqn*"><text-notation="math">\begin{equation*}
x=5 \tag {9a}
y=3 \tag{10}
x+y=8 \tag{11}
\end{equation*}</text-notation="math"></disp-formula>
...etc
I'm trying to convert them to
<disp-formula id="deqn5-7"><text-notation="math">\begin{equation*}
x=5 \tag{5}
y=3 \tag{6}
x+y=8 \tag {7}
\end{equation*}</text-notation="math"></disp-formula>
<disp-formula id="deqn3"><text-notation="math">\begin{equation*}
x+y=5 \tag{3}
\end{equation*}</text-notation="math"></disp-formula>
<disp-formula id="deqn4a"><text-notation="math">\begin{equation*}
a+y=15 \tag {4a}
\end{equation*}</text-notation="math"></disp-formula>
<disp-formula id="deqn9a-11"><text-notation="math">\begin{equation*}
x=5 \tag {9a}
y=3 \tag{10}
x+y=8 \tag{11}
\end{equation*}</text-notation="math"></disp-formula>
...etc
using a couple of regex replacements on the file. The first regex looks like
(?s)(<disp-formula id="deqn)[^"]*?("(?:.(?!/disp-formula))+?.\\tag\s?\{)([^}]+?)(\}(?:.(?!/disp-formula))+.\\tag\s?\{)([^}]+?)\}
which is replaced by
$1$3-$5$2$3$4$5}
and the second regex is
(?s)(<disp-formula id="deqn)[^"]*?("(?:.(?!/disp-formula|\\tag))+?.\\tag\s?\{)([^}]+?)(\}(?:.(?!/disp-formula|\\tag))+?</disp-formula>)
which is replaced by
$1$3$2$3$4
Both regexes have been tested with http://regexstorm.net/tester and they work there, but when I use them in my code they do not work.
I think I'm failing to escape some characters in my regex correctly. Can anyone help? Here is my code:
string content=File.ReadAllText(@"D:\test\00057_po.txt");
string pattern1 = "(?s)(<disp-formula id=\"deqn)[^\"]*?(\"(?:.(?!/disp-formula))+?.\\tag\\s?{{)([^}]+?)(}}(?:.(?!/disp-formula))+.\\tag\\s?{{)([^}]+?)}}";
string replacement1 = "$1$3-$5$2$3$4$5}}";
string pattern2="(?s)(<disp-formula id=\"deqn)[^\"]*?(\"(?:.(?!/disp-formula|\\tag))+?.\\tag\\s?{{)([^}]+?)(}}(?:.(?!/disp-formula|\\tag))+?</disp-formula>)";
string replacement2 = "$1$3$2$3$4";
Regex rgx = new Regex(pattern1);
Regex rgx2 = new Regex(pattern2);
string result1 = rgx.Replace(content, replacement1);
string result2 = rgx2.Replace(result1, replacement2);
File.WriteAllText(@"D:\test\00057_po.txt",result2);
Upvotes: 1
Views: 106
Reputation: 503
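Try these. In a regular (non-verbatim) C# string, "\\tag" reaches the regex engine as \tag, which it reads as a tab character followed by "ag", so matching a literal backslash takes "\\\\tag". Doubling the braces ({{ and }}) is only an escape in string.Format and interpolated strings; in a plain string it makes the regex look for two braces in a row, so use "\\{" and "\\}" instead: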
string pattern1 = "(?s)(<disp-formula id=\"deqn)[^\"]*?(\"(?:.(?!/disp-formula))+?.\\\\tag\\s?\\{)([^\\}]+?)(\\}(?:.(?!/disp-formula))+.\\\\tag\\s?\\{)([^}]+?)(?=\\})";
string replacement1 = "$1$3-$5$2$3$4$5";
string pattern2="(?s)(<disp-formula id=\"deqn)[^\"]*?(\"(?:.(?!/disp-formula|\\\\tag))+?.\\\\tag\\s?\\{)([^\\}]+?)(\\}(?:.(?!/disp-formula|\\\\tag))+?</disp-formula>)";
string replacement2 = "$1$3$2$3$4";
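To check the two replacements end to end, here is a minimal sketch that runs them over a shortened copy of the sample from the question. The class and method names (`DeqnIdDemo`, `Sample`, `FixIds`) are made up for illustration, and the input is built in memory instead of being read from the file.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, only for trying out the patterns above.
public static class DeqnIdDemo
{
    // "\\\\tag" in a regular C# string becomes \\tag for the regex engine,
    // i.e. a literal backslash followed by "tag".
    const string Pattern1 = "(?s)(<disp-formula id=\"deqn)[^\"]*?(\"(?:.(?!/disp-formula))+?.\\\\tag\\s?\\{)([^\\}]+?)(\\}(?:.(?!/disp-formula))+.\\\\tag\\s?\\{)([^}]+?)(?=\\})";
    const string Replacement1 = "$1$3-$5$2$3$4$5";
    const string Pattern2 = "(?s)(<disp-formula id=\"deqn)[^\"]*?(\"(?:.(?!/disp-formula|\\\\tag))+?.\\\\tag\\s?\\{)([^\\}]+?)(\\}(?:.(?!/disp-formula|\\\\tag))+?</disp-formula>)";
    const string Replacement2 = "$1$3$2$3$4";

    // Two blocks taken from the question: one with three tags, one with a single tag.
    public static string Sample()
    {
        return string.Join("\n", new[]
        {
            @"<disp-formula id=""deqn*""><text-notation=""math"">\begin{equation*}",
            @"x=5 \tag{5}",
            @"y=3 \tag{6}",
            @"x+y=8 \tag {7}",
            @"\end{equation*}</text-notation=""math""></disp-formula>",
            @"<disp-formula id=""deqn*""><text-notation=""math"">\begin{equation*}",
            @"x+y=5 \tag{3}",
            @"\end{equation*}</text-notation=""math""></disp-formula>"
        });
    }

    // First pass handles blocks with several tags (first-last),
    // second pass handles blocks with exactly one tag.
    public static string FixIds(string content)
    {
        string step1 = Regex.Replace(content, Pattern1, Replacement1);
        return Regex.Replace(step1, Pattern2, Replacement2);
    }

    public static void Main()
    {
        Console.WriteLine(FixIds(Sample()));
    }
}
```

With this input the first block should come out with id="deqn5-7" and the second with id="deqn3". A verbatim string (@"...") would avoid the doubled backslashes, at the cost of writing each quote as "".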
Upvotes: 1
Reputation:
You need to escape { and } as well (once each, not doubled), and double the backslash in front of tag, like:
"(?s)(<disp-formula id=\"deqn)[^\"]*?(\"(?:.(?!\\/disp-formula))+?.\\\\tag\\s?\\{)([^\\}]+?)(\\}(?:.(?!/disp-formula))+.\\\\tag\\s?\\{)([^\\}]+?)\\}";
Upvotes: 0