Nemo XXX

Reputation: 681

pandoc command line parameters for resolving internal links

My problem is similar to this post, but not identical. I somehow can't figure out the correct pandoc command line parameters for maintaining/resolving cross-document links when using a couple of interlinked HTML files as the input.

Let's say I have two files, chapter1.xhtml and chapter2.xhtml located in the /home/user/Documents folder with the following contents:

<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<h3>Chapter 1</h3>
<p><a href="/home/user/Documents/chapter2.xhtml">Next chapter</a><br /></p>

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</body>
</html>

which contains a link to the next document.

and

<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<h3>Chapter 2</h3>
<p><a href="/home/user/Documents/chapter1.xhtml">Previous chapter</a><br /></p>

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</body>
</html>

which contains a link to the previous document.

I used the following command line parameters:

$ pandoc -s --toc --verbose -o /home/user/Documents/output.markdown /home/user/Documents/chapter1.xhtml /home/user/Documents/chapter2.xhtml

And I got the following output:

---
---

-   [Chapter 1](#chapter-1)
-   [Chapter 2](#chapter-2)

### Chapter 1

[Next chapter](/home/user/Documents/chapter2.xhtml)\

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

### Chapter 2

[Previous chapter](/home/user/Documents/chapter1.xhtml)\

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

This problem also occurs when I select docx or latex/pdf as the output format. I also tried to use relative links, but nothing worked.

What are the correct parameters for resolving cross-document links?

tl;dr: I don't want links that still contain the original file paths; I want them to point to the corresponding place in the new output document.

Upvotes: 3

Views: 4652

Answers (1)

mb21

Reputation: 39179

The problem is that your links contain absolute paths (/home/user/Documents/chapter1.xhtml) instead of relative ones (chapter1.xhtml). I can't imagine an EPUB file containing absolute paths, and if it does, its links will only ever work on your computer. So the fix has to happen in those files, before you feed them to pandoc.
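One possible way to do that fix, sketched in Python. It assumes the two chapter files from the question and uses the heading identifiers that show up in the question's table of contents (#chapter-1, #chapter-2) as link targets. ANCHORS and rewrite_links are hypothetical names for this illustration, not part of pandoc, and the regex only handles double-quoted href attributes.

```python
import posixpath
import re

# Assumed mapping: chapter file name -> heading identifier seen in the
# question's generated table of contents. Adjust for your own files.
ANCHORS = {
    "chapter1.xhtml": "#chapter-1",
    "chapter2.xhtml": "#chapter-2",
}

# Matches href="path" or href="path#fragment" (double quotes only).
HREF = re.compile(r'href="([^"#]*)(#[^"]*)?"')


def rewrite_links(xhtml: str, anchors: dict) -> str:
    """Replace links to known chapter files with in-document anchors."""

    def repl(match):
        path, fragment = match.group(1), match.group(2)
        name = posixpath.basename(path)
        # Leave URLs with a scheme and unknown files untouched.
        if "://" in path or name not in anchors:
            return match.group(0)
        # Keep an explicit fragment if the link already had one.
        return 'href="%s"' % (fragment or anchors[name])

    return HREF.sub(repl, xhtml)
```

Running each chapter through rewrite_links and writing the result to a copy gives files whose links point at anchors rather than paths; those copies can then be passed to pandoc with the same command as before.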

Note that round-tripping with pandoc from Markdown to EPUB and back to HTML works as expected:

$ pandoc -o foo.epub
# foo

adfs

# bar

go [to foo](#foo)


$ unzip foo.epub

$ cat ch002.xhtml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta http-equiv="Content-Style-Type" content="text/css" />
  <meta name="generator" content="pandoc" />
  <title>bar</title>
  <link rel="stylesheet" type="text/css" href="stylesheet.css" />
</head>
<body>
<div id="bar" class="section level1">
<h1>bar</h1>
<p>go <a href="ch001.xhtml#foo">to foo</a></p>
</div>
</body>
</html>

$ pandoc foo.epub

<p><span id="ch001.xhtml"></span></p>
<div id="ch001.xhtml#foo" class="section level1">
<h1>foo</h1>
<p>adfs</p>
</div>
<p><span id="ch002.xhtml"></span></p>
<div id="ch002.xhtml#bar" class="section level1">
<h1>bar</h1>
<p>go <a href="#ch001.xhtml#foo">to foo</a></p>
</div>

P.S.

Using two input documents like:

pandoc -o output.md chapter1.xhtml chapter2.xhtml

works as the pandoc README states:

If multiple input files are given, pandoc will concatenate them all (with blank lines between them) before parsing.

So as far as pandoc's parser is concerned, the inputs are a single document, and a link to chapter2.xhtml is just a link to some external file. It is no wonder that cross-file links are not resolved.
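To see which links are affected before converting, a small check along these lines may help. It is only a sketch using Python's standard library; LinkCollector and cross_file_links are made-up names, and "affected" here simply means that the file name in the href matches one of the input files.

```python
from html.parser import HTMLParser
import posixpath


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def cross_file_links(xhtml, input_names):
    """Return the hrefs whose target file is one of the input files."""
    parser = LinkCollector()
    parser.feed(xhtml)
    names = set(input_names)
    return [
        href
        for href in parser.hrefs
        # Compare only the file name, ignoring directories and fragments.
        if posixpath.basename(href.split("#", 1)[0]) in names
    ]
```

Called with the contents of chapter1.xhtml and the list ["chapter1.xhtml", "chapter2.xhtml"], it returns the hrefs that point at another input file, which are the ones that need rewriting.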

Upvotes: 2
