cake3d

Reputation: 3

Transform XML to LaTeX

In Scala, how can I transform:

<p>here we have a <a href="http://www.scala-lang.org/api/current/index.html">link</a> example.</p>

to

here we have a \url{http://www.scala-lang.org/api/current/index.html}{link} example.

where <p></p> maps to "nothing", and <a href="_">_</a> maps to \url{_}{_}

Upvotes: 0

Views: 758

Answers (3)

DarrenWang

Reputation: 478

A more general approach is to use a parser, such as Scala's parser combinators or one of the existing Java parsers. If the input is well-formed XML, processing it as XML works too.
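To make the "existing Java parser" option a bit more concrete, here is a rough sketch that uses the DOM parser shipped with the JDK. It assumes the input is well-formed XML, and the names `DomToLatex`, `render`, `children` and `convert` are made up for this example. It does no LaTeX escaping, and unknown tags are simply replaced by their content.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.{Element, Node, Text}

// Hypothetical helper object, only for illustration.
object DomToLatex {
  // Render one DOM node as LaTeX text.
  def render(node: Node): String = node match {
    case e: Element if e.getTagName == "a" =>
      "\\url{" + e.getAttribute("href") + "}{" + children(e) + "}"
    case e: Element => children(e) // <p> and any other tag: keep only the content
    case t: Text    => t.getData   // plain text (entities are already decoded by the parser)
    case _          => ""          // comments, processing instructions, ...
  }

  // Render all child nodes of a node and concatenate the results.
  def children(node: Node): String = {
    val list = node.getChildNodes
    (0 until list.getLength).map(i => render(list.item(i))).mkString
  }

  // Parse an XML string and convert its root element.
  def convert(xml: String): String = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
    render(doc.getDocumentElement)
  }
}
```

Calling `DomToLatex.convert` on the paragraph from the question should give the LaTeX line the question asks for.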

Upvotes: 0

Debilski

Reputation: 67888

As an alternative, if you need more transformations*, you can start with this. It also works with nested <a/> tags, for whatever sense that may make.

The code still needs some escape handling: some characters are escaped in XML but not in LaTeX, and the other way round. Feel free to add this.

import scala.xml._

val input = <p>And now try it on a <a href="link1">text</a> with <a href="link2">two urls</a></p>

// Build the LaTeX command from the attributes of an <a> element and its already transformed content.
def mkURL(meta: MetaData, text: String): String = {
  val url = meta.asAttrMap.get("href")
  "\\url{%s}{%s}".format(url getOrElse "", text)
}

// Convert each node recursively: <p> is dropped, <a> becomes a url command, anything else is kept as is.
def transform(xhtml: NodeSeq): String =
  xhtml.map {
    case Node("p", _, ch @ _*)    => transform(ch)
    case Node("a", meta, ch @ _*) => mkURL(meta, transform(ch))
    case x                        => x.toString
  }.mkString

println(transform(input))

// And now try it on a \url{link1}{text} with \url{link2}{two urls}

[*] Adding support for \emph would be something like

case Node("em", _, ch @ _*) => "\\emph{" + transform(ch) + "}"
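For the escape handling mentioned above, a minimal sketch of the LaTeX side could look like this. The object name `LatexEscape` and the character table are assumptions for illustration; the table is not exhaustive and should be adjusted to whatever your LaTeX setup needs.

```scala
// Hypothetical helper, only for illustration.
object LatexEscape {
  // Common LaTeX special characters and one possible escaped form for each.
  private val replacements: Map[Char, String] = Map(
    '\\' -> "\\textbackslash{}",
    '&'  -> "\\&",
    '%'  -> "\\%",
    '$'  -> "\\$",
    '#'  -> "\\#",
    '_'  -> "\\_",
    '{'  -> "\\{",
    '}'  -> "\\}",
    '~'  -> "\\textasciitilde{}",
    '^'  -> "\\textasciicircum{}"
  )

  // Replace characters one by one, so already escaped output is never escaped twice.
  def escape(s: String): String =
    s.map(c => replacements.getOrElse(c, c.toString)).mkString
}
```

In transform you would probably apply this to text nodes only, and use the node's text rather than toString so that XML entities such as &amp; are decoded before the LaTeX escaping happens. URLs inside the url command may need different treatment.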

Upvotes: 3

Vasil Remeniuk

Reputation: 20627

Define regexps:

scala> val link = """<a href="(.+)">(.+)</a>""".r
link: scala.util.matching.Regex = <a href="(.+)">(.+)</a>

scala> val paragraph = """<p>(.+)</p>""".r
paragraph: scala.util.matching.Regex = <p>(.+)</p>

scala> val text = """<p>here we have a <a href="http://www.scala-lang.org/api/current/index.html">link</a> example.</p>"""
text: java.lang.String = <p>here we have a <a href="http://www.scala-lang.org/api/current/index.html">link</a> example.</p>

Apply them to the input:

scala> val modifiedText = paragraph.replaceAllIn(text, {matched => val paragraph(content) = matched; content})
modifiedText: String = here we have a <a href="http://www.scala-lang.org/api/current/index.html">link</a> example.

scala> link.replaceAllIn(modifiedText, {matched => val link(href, title) = matched; "\\\\url{%s}{%s}" format(href, title)})
res11: String = here we have a \url{http://www.scala-lang.org/api/current/index.html}{link} example.
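One caveat with the patterns above: `(.+)` is greedy, so with two links on the same line the first group would swallow everything up to the last `">`. A possible variant, sketched here with the made-up name `RegexToLatex`, uses `[^"]+` and a reluctant `(.+?)` instead, and passes the result through `Regex.quoteReplacement` so that a `$` or backslash in the matched text is not interpreted as part of the replacement syntax. It still assumes each tag sits on a single line and that the attributes look exactly like this.

```scala
import scala.util.matching.Regex

// Hypothetical helper object, only for illustration.
object RegexToLatex {
  // [^"]+ stops at the closing quote; (.+?) is reluctant, so each link is matched on its own.
  private val link = """<a href="([^"]+)">(.+?)</a>""".r
  private val paragraph = """<p>(.+?)</p>""".r

  def convert(text: String): String = {
    // Drop the <p> wrapper but keep its content.
    val withoutP = paragraph.replaceAllIn(text, m => Regex.quoteReplacement(m.group(1)))
    // Turn every link into a url command; quoteReplacement keeps the backslash literal.
    link.replaceAllIn(withoutP, m =>
      Regex.quoteReplacement("\\url{" + m.group(1) + "}{" + m.group(2) + "}"))
  }
}
```

For anything beyond simple, flat markup like this, the XML-based approach is likely to be more robust than regular expressions.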

Upvotes: -1
