I'm new to Elisp and I need to convert a piece of LaTeX code to XML.
LaTeX:
\tag[read=true]{Please help}\tag[notread=false]{Please help II}
XML:
<tag read='true'> Please help </tag>
<tag notread='false'> Please help II </tag>
I wrote a regex to search for \tag, but now I need to somehow read the attribute names read and notread, emit them as XML attributes, and then read the value that follows each "=".
The regex I have tried:
[..] (while (re-search-forward "\\\\\\<tag\\>\\[" nil t) [..]
This is not a full solution, but hopefully demonstrates how to use backreferences with regular expressions.
Briefly, every group you create with \\(...\\) in the regex is captured, and can be recalled with (match-string N), where N is the sequential number of the group, starting from 1 for the leftmost opening parenthesis, and proceeding so that each opening parenthesis gets a number one higher than the previous.
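(So if you have alternations, some backreferences will be undefined. If you apply the regex "\\(foo\\)\\|\\(bar\\)" to the string "bar", group 1 takes no part in the match, so (match-string 1) returns nil, and (match-string 2) returns "bar". When you match against a string rather than buffer text, pass that string as the second argument to match-string.)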
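The alternation case can be checked directly. The following is a minimal sketch; the function name my-alternation-groups is hypothetical and exists only for this illustration. It uses string-match, so the matched string is passed to match-string as the second argument.

```elisp
(defun my-alternation-groups (s)
  "Match S against an alternation of two groups and return both groups.
Hypothetical helper, for illustration only."
  (when (string-match "\\(foo\\)\\|\\(bar\\)" s)
    (list (match-string 1 s)     ; nil when the first alternative did not match
          (match-string 2 s))))  ; nil when the second alternative did not match

(my-alternation-groups "bar")    ; => (nil "bar")
```

The group that did not take part in the match comes back as nil rather than as an empty string, so test for that before concatenating.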
;; Note: `insert' adds the XML after each match and leaves the
;; original LaTeX in the buffer.
(while
    (re-search-forward
     "\\\\\\<\\(tag\\)\\>\\[\\([^][=]*\\)=\\([^][]*\\)\\]{\\([^}]*\\)}"
     nil t)
  (insert (concat "<" (match-string 1) " "
                  (match-string 2) "='" (match-string 3) "'>"
                  (match-string 4)
                  "</" (match-string 1) ">\n")))
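If the goal is to replace the LaTeX rather than add XML next to it, one possible variation is to call replace-match instead of insert. This is only a sketch under some assumptions: the function name my-latex-tags-to-xml is made up here, it works on a string via a temporary buffer, and like the loop above it handles exactly one attribute per tag.

```elisp
(defun my-latex-tags-to-xml (text)
  "Return TEXT with \\tag[NAME=VALUE]{BODY} rewritten as XML elements.
Hypothetical helper for illustration; handles one attribute per tag."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (while (re-search-forward
            "\\\\\\<\\(tag\\)\\>\\[\\([^][=]*\\)=\\([^][]*\\)\\]{\\([^}]*\\)}"
            nil t)
      ;; The replacement string is built from the captured groups before
      ;; `replace-match' changes the buffer.  FIXEDCASE and LITERAL are
      ;; both t so the text is inserted exactly as built.
      (replace-match
       (concat "<" (match-string 1) " "
               (match-string 2) "='" (match-string 3) "'>"
               (match-string 4)
               "</" (match-string 1) ">")
       t t))
    (buffer-string)))

(my-latex-tags-to-xml "\\tag[read=true]{Please help}")
;; => "<tag read='true'>Please help</tag>"
```

Passing t for LITERAL matters here: without it, a backslash in the captured text would be interpreted as a replacement escape.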
That regex certainly is a monster; you might want to decompose and document it somewhat.
(defconst latex-to-xml-regex
  (concat "\\\\"        ; literal backslash
          "\\<"         ; word boundary (not really necessary)
          "\\(tag\\)"   ; group 1: capture tag
          "\\["         ; literal open square bracket
          "\\("         ; group 2: attribute name
          "[^][=]*"     ; attribute name regex
          "\\)"         ; group 2 end
          "="           ; literal
          "\\("         ; group 3: attribute value
          "[^][]*"      ; attribute value regex
          "\\)"         ; group 3 end
          "\\]"         ; literal close square bracket
          "{"           ; begin text group
          "\\("         ; group 4: text
          "[^}]*"       ; text regex
          "\\)"         ; group 4 end
          "}")          ; end text group
  "Regex for `latex-to-xml` (assuming your function is called that)")
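As an alternative way to document the pattern (not what the code above uses), Emacs also ships the rx macro, which lets you write the regex as a structured form. The sketch below is an assumption-laden illustration: the constant name my-latex-tag-rx is hypothetical, and the optional word boundary is left out.

```elisp
(require 'rx)

(defconst my-latex-tag-rx
  (rx "\\"                            ; literal backslash
      (group "tag")                   ; group 1: tag name
      "["                             ; literal open square bracket
      (group (* (not (any "][="))))   ; group 2: attribute name
      "="                             ; literal
      (group (* (not (any "]["))))    ; group 3: attribute value
      "]"                             ; literal close square bracket
      "{"                             ; begin text group
      (group (* (not (any "}"))))     ; group 4: text
      "}")                            ; end text group
  "Hypothetical `rx' form of the tag regex, for illustration.")

(let ((s "\\tag[read=true]{Please help}"))
  (when (string-match my-latex-tag-rx s)
    (list (match-string 2 s) (match-string 3 s) (match-string 4 s))))
;; => ("read" "true" "Please help")
```

The group numbering is the same as in the string version, so the match-string calls do not need to change.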