Reputation: 350
This is a cross-post from TeX, but it did not get any answers there. And since I assume the problem has more to do with my understanding of regular expressions (or better, lack thereof) than with LaTeX itself, StackOverflow may have been the better place to ask to begin with.
I would like to use BibTool (which was written in C, if this is of any consequence here) to enclose some strings in a bib-file in curly braces. The test bib entry looks like this:
@Article{Cite1,
author = {Adelbert, A.},
date = {2020},
journaltitle = {A Journal},
title = {A title with just \textit{Test} structure and some chemistry \ce{CO2}},
number = {2},
pages = {1--4},
volume = {1},
}
I have created the following BibTool resource file:
resource {biblatex}
preserve.keys = on
preserve.key.case = on
rewrite.rule = {"\\\(.*{.*}\)" "{{\1}}"}
The rewrite.rule is supposed to do the following:
Match any command starting with \, like \ce{}, \textit{}, etc. This is done by the \\ at the beginning of the regular expression.
\(\): Capture a random string at the beginning, followed by {, a random string, followed by }; i.e. the string textit{Test}.
Wrap the captured string in double braces via "{{\1}}".
What it manages so far:
It finds the commands, but the replacement drops the leading \.
So far, the code returns the following:
@Article{Cite1,
Author = {Adelbert, A.},
Date = {2020},
JournalTitle = {A Journal},
Title = {A title with just {{textit{Test} structure and some chemistry {{ce{CO2}}}}}},
Number = {2},
Pages = {1--4},
Volume = {1},
}
You see it finds the strings and puts {{ at the beginning of each string. Unfortunately, it puts }} at the end of the field, not the string, so I now have 6 curly braces at the end of the title field. The braces do match, but two of them should come right after {{textit{Test}, not at the very end. I tried various constructions like
rewrite.rule = {"\\\(.*{.*}\)$" "{{\1}}"},
rewrite.rule = {"\\\(.*{.*}\) ?$" "{{\1}}"},
rewrite.rule = {"\\\(.*{.*}\)*$" "{{\1}}"}
but none of these worked.
When trying to get the \ back at the beginning of the string, using rewrite.rule = {"\\\(.*{.*}\)" "{{\\\1}}"}, I get the \ back, but also thousands of {} until I get a Rewrite limit exceeded error.
I am not very good with regular expressions and would be happy for any comments.
Upvotes: 2
Views: 369
Reputation: 136
My approach would use two phases. In the first phase I would process each macro with one argument and, in the result, replace the \ by a placeholder (here ##). In the second phase I simply replace ## by \.
In BibTool this looks as follows:
rewrite.rule {"\\\(\([a-zA-Z]+\|.\){[^{}]*}\)" "{##\1}"}
rewrite.rule {"##" "\\"}
Note that, in general, the task described cannot be solved with regular expressions...
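To make the two phases easier to follow outside BibTool, here is a minimal Python sketch of the same idea. It is not BibTool code: the function name protect_macros is hypothetical, the pattern is a translation of the rule above into Python's re syntax, and the loop is only an assumption about how BibTool re-applies a rewrite rule until it no longer matches (which would explain the Rewrite limit exceeded error in the question). With the ## placeholder the loop terminates, because the rewritten text no longer contains a \ for the pattern to find.

```python
import re

# Python translation of the BibTool pattern: a backslash, then either a
# run of letters or a single character, then one brace group that itself
# contains no braces. Group 1 is everything after the backslash.
MACRO = re.compile(r"\\((?:[a-zA-Z]+|.)\{[^{}]*\})")


def protect_macros(value: str) -> str:
    """Wrap each one-argument macro in braces using a two-phase rewrite."""
    # Phase 1: rewrite one macro at a time until nothing matches.
    # Assumption: this mimics repeated rule application. The backslash is
    # hidden behind "##" so the rewritten macro is not matched again.
    while True:
        rewritten = MACRO.sub(r"{##\1}", value, count=1)
        if rewritten == value:
            break
        value = rewritten
    # Phase 2: restore the backslashes.
    return value.replace("##", "\\")


title = r"A title with just \textit{Test} structure and some chemistry \ce{CO2}"
print(protect_macros(title))
```

Like the rule it mirrors, this only handles macros whose argument contains no nested braces; something like \textit{a {b} c} is left alone.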
Upvotes: 2
Reputation: 3658
The behavior of .* by default is to match as many characters as possible. This is called 'greedy matching' in regex terms.
Your pattern is likely matching the following on hitting the first \:
\textit{Test} structure and some chemistry \ce{CO2}}
Replacing the text to:
{{textit{Test} structure and some chemistry \ce{CO2}}}}
And then finding the next \ and replacing:
\ce{CO2}}}} becomes {{ce{CO2}}}}}}
Total effect:
{A title with just \textit{Test} structure and some chemistry \ce{CO2}}
{A title with just {{textit{Test} structure and some chemistry {{ce{CO2}}}}}}
To change the behaviour in most regex flavors you can put a ? after the quantifier: .*? to make it 'lazy', that is match the least amount of characters.
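The difference can be checked with a short Python sketch. This uses Python's re module purely for illustration; it is an assumption that the same contrast carries over to other engines, and whether BibTool's regex engine accepts .*? at all is not confirmed here. Note also that Python's re.sub makes a single pass, so the greedy result below corresponds to the first replacement step described above, with \ce{CO2} still untouched.

```python
import re

field = r"{A title with just \textit{Test} structure and some chemistry \ce{CO2}}"

# Greedy: .* runs as far as it can, so the match extends from the first
# backslash to the last closing brace of the whole field.
greedy = re.sub(r"\\(.*\{.*\})", r"{{\1}}", field)

# Lazy: .*? stops at the first opening and first closing brace it can use,
# so each macro is matched and wrapped on its own.
lazy = re.sub(r"\\(.*?\{.*?\})", r"{{\1}}", field)

print(greedy)
print(lazy)
```

The lazy version still drops the leading \ and would stop too early on nested braces, so it addresses only the placement of the closing braces.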
Upvotes: 2