Reputation: 61
When converting markdown files with cross-document links to html, docx or pdf, the links get broken in the process. I use pandoc 1.19.1 and MiKTeX. This is my test case:
File1: doc1.md
[link1](/doc2.md)
File2: doc2.md
[link2](/doc1.md)
The result in html with this call to pandoc: pandoc doc1.md doc2.md -o test.html looks like this:
<p><a href="/doc2.md">link1</a> <a href="/doc1.md">link2</a></p>
As pdf a link is created but it does not work. Exported as docx it looks the same.
I would have assumed that when multiple files are processed and concatenated into the same output file, the result would contain page-internal links, such as anchor links for html output. Instead, each link is created in the output file exactly as it was in the input files. Even the original file extension .md is preserved in the created links. What am I doing wrong?
My problem looks a bit like this question: "pandoc command line parameters for resolving internal links". In the comments of that question the bug is said to be fixed by a pull request in May, but the bug still seems to exist. Greetings, Georg
Upvotes: 6
Views: 6360
Reputation: 2564
I had a similar problem when trying to export a GitLab wiki to PDF. There, links between pages look like filename-of-page#anchor-name and links within a page look like #anchor-name. I wrote a (finicky and fragile) pandoc filter that solved the problem for me; who knows, it may be useful to others.
To explain my solution I'll use two test files, 101-first-page.md:
# First page // Gitlab automatically creates an anchor here named #first-page
Some text.
## Another section // Gitlab automatically creates an anchor here named #another-section
A link to the [first section](#first-page)
and 102-second-page.md:
# Second page // Gitlab automatically creates an anchor here named #second-page
Some text and [a link to the first page](101-first-page#first-page).
When they are concatenated to render as one document in pandoc, links between pages break because the anchors change. Below is the concatenated file, with the anchors in comments.
# First page // anchor=#first-page
Some text.
## Another section // anchor=#another-section
A link to the [first section](#first-page)
# Second page // anchor=#second-page
Some text and [a link to the first page](101-first-page#first-page). // <-- this anchor no longer exists.
The link from the second to the first page breaks as the link target is incorrect.
By pre-processing all markdown files first individually via a pandoc filter, and then concatenating the resulting json files I was able to get all links working.
The file name has to match the page's first header: 101-A file on the wiki.md should have a first level-one header named A file on the wiki. The filter itself (together with the pandoc script) is available in this gist.
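What the filter does is:
it takes the anchor of the page's first level-one header as a prefix, e.g. first-page;
it prefixes the anchors of the page's headers with it, e.g. first-page-another-section;
it prefixes links within the page in the same way, e.g. #first-page-first-page;
it rewrites links to other pages, so 101-first-page#first-page becomes #first-page-first-page.
After the script has run every markdown file through this filter individually and converted them to JSON files, it concatenates the JSON files and converts the result to a PDF.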
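The following is not the code from the gist, only a minimal Python sketch of the same idea, under a few assumptions: the pandoc JSON has the layout used since pandoc 1.18 (an object with a "blocks" list, Header as [level, [id, classes, attributes], inlines] and Link as [attr, inlines, [url, title]]), page names are already slug-like as in the two test files, and every header anchor on a page, including the first one, receives the prefix. All function names are made up for illustration.

```python
import json
import re


def page_prefix(page_name):
    """'101-first-page' -> 'first-page' (assumes a numeric file-name prefix)."""
    return re.sub(r"^\d+-", "", page_name)


def rewrite_link(url, prefix):
    """Point a wiki-style link at the prefixed anchor in the merged document."""
    if "://" in url:
        return url  # external link: leave untouched
    page, hash_sign, anchor = url.partition("#")
    if not hash_sign:
        return url  # no anchor part: not handled in this sketch
    owner = page_prefix(page) if page else prefix
    return "#" + owner + "-" + anchor


def first_h1_id(doc):
    """Return the identifier of the first level-one header, or None."""
    for block in doc["blocks"]:
        if block.get("t") == "Header" and block["c"][0] == 1:
            return block["c"][1][0]
    return None


def walk(node, prefix):
    """Recursively rewrite Header identifiers and Link targets in place."""
    if isinstance(node, list):
        for child in node:
            walk(child, prefix)
    elif isinstance(node, dict):
        tag, content = node.get("t"), node.get("c")
        if tag == "Header":
            attr = content[1]  # [identifier, classes, key-value pairs]
            attr[0] = prefix + "-" + attr[0]
        elif tag == "Link":
            target = content[2]  # [url, title]
            target[0] = rewrite_link(target[0], prefix)
        for value in node.values():
            walk(value, prefix)


def prefix_document(doc):
    """Apply the per-page rewrite; the prefix is read before anything changes."""
    prefix = first_h1_id(doc)
    if prefix is not None:
        walk(doc["blocks"], prefix)
    return doc


def merge_documents(docs):
    """Join several pandoc JSON documents by appending their blocks."""
    merged = dict(docs[0], blocks=[])
    for doc in docs:
        merged["blocks"].extend(doc["blocks"])
    return merged


if __name__ == "__main__":
    # Hand-written stand-in for the JSON of 102-second-page.md;
    # the API version below is a placeholder value.
    second = {
        "pandoc-api-version": [1, 17],
        "meta": {},
        "blocks": [
            {"t": "Header", "c": [1, ["second-page", [], []],
                                  [{"t": "Str", "c": "Second"}]]},
            {"t": "Para", "c": [
                {"t": "Link", "c": [["", [], []],
                                    [{"t": "Str", "c": "link"}],
                                    ["101-first-page#first-page", ""]]}]},
        ],
    }
    print(json.dumps(merge_documents([prefix_document(second)]), indent=1))
```

In a real run the per-page JSON would come from pandoc -t json and the merged result would be fed back through pandoc -f json; the in-memory document above only stands in for that.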
Upvotes: 3
Reputation: 39488
As the pandoc README states:
If multiple input files are given, pandoc will concatenate them all (with blank lines between them) before parsing.
So as far as pandoc's parser is concerned, it is one document, and you have to write the links in your files as if they were all in one file; see also this answer for details.
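To write links "as if in one file" you need to know which anchor pandoc will give each heading. Here is a rough Python approximation of the automatic heading identifiers described in the pandoc manual. The function name is made up, the input is assumed to be the heading's plain text, and the exact rules may differ between pandoc versions and extensions, so treat it as a sketch rather than pandoc's actual algorithm.

```python
import re


def approx_pandoc_id(heading_text):
    """Approximate pandoc's automatic identifier for a heading's plain text."""
    # Drop everything except letters, digits, underscores, hyphens,
    # periods and whitespace.
    kept = re.sub(r"[^\w\s.-]", "", heading_text)
    # Replace whitespace runs with hyphens, then lowercase.
    lowered = re.sub(r"\s+", "-", kept.strip()).lower()
    # Remove everything up to the first letter.
    for index, char in enumerate(lowered):
        if char.isalpha():
            return lowered[index:]
    # Nothing left: the manual names "section" as the fallback.
    return "section"


if __name__ == "__main__":
    for title in ["Second page", "3. Applications", "33"]:
        print(title, "->", "#" + approx_pandoc_id(title))
```

So if doc2.md started with a heading "Second page" (the test case in the question has no headings, so this is hypothetical), the link in doc1.md would be written as [link1](#second-page) rather than [link1](/doc2.md).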
Upvotes: 3