Tim Long

Reputation: 13798

How can I get Pandoc to preserve my reference links?

I'm using the Pandoc-format extension in Visual Studio Code to format my markdown documents.

I write my reference links in a similar style to Stack Overflow, like this:

This is a test file, created by Tim at [Tigra Astronomy][tigra].

This is another line.

[tigra]: http://tigra-astronomy.com "Tigra home page"

Pandoc really mangles these references. I've tried various combinations of options and command-line experiments, such as:

pandoc .\input.md --from markdown --to markdown-shortcut_reference_links --reference-links --reference-location=document

That command actually produces this output:

This is a test file, created by Tim at [Tigra Astronomy][].

This is another line.

[Tigra Astronomy]: http://tigra-astronomy.com "Tigra home page"

...which is pretty close but still not quite there. So is there a way I can persuade Pandoc to just keep my references 'as written'?

Upvotes: 4

Views: 2983

Answers (3)

abdullahselek

Reputation: 8463

It has been a while since this question was asked, but in case someone is still looking for an answer: Markdown already has ways to create links. The details and the pandoc commands I use are below. I'm using pandoc 2.19.2 with the pdflatex engine.

I used double square brackets to indicate that a link is a reference.

1. [[1]](https://en.wikipedia.org/wiki/Wikipedia:About)
2. [[Google]](https://www.google.com/)

Pandoc command

pandoc --variable classoption=twocolumn --variable papersize=a4paper YOUR_FILE.md -o YOUR_FILE.pdf --pdf-engine=/Library/TeX/texbin/pdflatex

or

pandoc --variable classoption=twocolumn --variable papersize=a4paper --wrap=preserve --reference-links YOUR_FILE.md -o YOUR_FILE.pdf --pdf-engine=/Library/TeX/texbin/pdflatex

Upvotes: 0

julienb

Reputation: 1

There is a native way to keep the link title in Pandoc Markdown, using an inline-style link with a title:

[I'm an inline-style link with title](https://www.noproblemo.ca "No Problemo")

As you can see, the title goes inside the parentheses, in quotation marks, separated from the URL by a space.

Upvotes: 0

Waylan

Reputation: 42607

So is there a way I can persuade Pandoc to just keep my references 'as written'?

In short, no, as Pandoc does not retain that information.

With the command pandoc --from markdown --to native you can have Pandoc output the Abstract Syntax Tree (AST), which is Pandoc's internal native representation of the document. For your sample document, the AST looks like this (see also the online demo):

[Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "test",Space,Str "file,",Space,Str "created",Space,Str "by",Space,Str "Tim",Space,Str "at",Space,Link ("",[],[]) [Str "Tigra",Space,Str "Astronomy"] ("http://tigra-astronomy.com","Tigra home page"),Str "."]
,Para [Str "This",Space,Str "is",Space,Str "another",Space,Str "line."]]

As you can see, the link does not retain the "name" you assigned it. Therefore, when Pandoc converts the AST back into Markdown, it cannot use your "name" as it is not available. Initially I had thought that perhaps you could use a filter or custom writer to include your reference name, but that won't work either as the name is simply not available.
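To make this concrete, here is a minimal Python sketch that unpacks a Link node as it would appear in Pandoc's JSON AST (the same tree, written out with --to json instead of --to native). The JSON fragment is hand-written to mirror the Link in the native output above rather than captured from a real run, and describe_link is a hypothetical helper name, not part of Pandoc.

```python
import json

# Hand-written fragment mirroring the Link node from the native output above,
# in the shape pandoc uses for its JSON AST: [attr, inlines, target].
LINK_JSON = """
{"t": "Link",
 "c": [["", [], []],
       [{"t": "Str", "c": "Tigra"}, {"t": "Space"}, {"t": "Str", "c": "Astronomy"}],
       ["http://tigra-astronomy.com", "Tigra home page"]]}
"""


def describe_link(node):
    """Return every piece of information a Link node carries (hypothetical helper)."""
    attr, inlines, target = node["c"]
    identifier, classes, key_values = attr
    url, title = target
    # Flatten the link text: Str nodes carry a string, Space nodes carry nothing.
    text = "".join(" " if i["t"] == "Space" else i.get("c", "") for i in inlines)
    return {
        "identifier": identifier,
        "classes": classes,
        "attributes": key_values,
        "text": text,
        "url": url,
        "title": title,
    }


link = describe_link(json.loads(LINK_JSON))
print(link)
```

The node has slots for attributes, the link text, the URL and the title, and none of them holds the label tigra. By the time a filter or writer sees the link, the label is already gone.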

In the end, this should not be surprising. Pandoc does not promise to retain all meta-data included in documents. In fact, the documentation warns:

Because pandoc’s intermediate representation of a document is less expressive than many of the formats it converts between, one should not expect perfect conversions between every format and every other. Pandoc attempts to preserve the structural elements of a document, but not formatting details such as margin size. And some document elements, such as complex tables, may not fit into pandoc’s simple document model. While conversions from pandoc’s Markdown to all formats aspire to be perfect, conversions from formats more expressive than pandoc’s Markdown can be expected to be lossy.

Of course, the reference name is a Markdown-specific feature. However, Markdown supports representing links in multiple ways, and maintaining reference names is not necessary to maintain a valid Markdown document. Therefore, reference names are one of the things that Pandoc loses.

But why does Pandoc need to run through the AST if it is just outputting the same format? Because that is how it is architected. See the API documentation for more information on that.
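Since the label cannot survive inside Pandoc, one possible workaround is to put it back outside Pandoc, after the conversion. The following is a rough Python sketch of that idea, not something Pandoc provides: restore_labels is a hypothetical helper that matches reference definitions by URL between the original file and Pandoc's output, then renames the generated labels. It assumes each URL has exactly one label in the original, that Pandoc leaves the URL unchanged, and that the output uses the [text][] form shown in the question with each label on a single line.

```python
import re

# Matches a reference definition at the start of a line,
# e.g. [tigra]: http://example.com "Title", capturing label and URL.
DEFINITION = re.compile(r'^\[([^\]]+)\]:\s*(\S+)', re.MULTILINE)


def restore_labels(original, converted):
    """Rename labels in `converted` to the ones `original` used for the same URL.

    Hypothetical post-processing helper; assumes one label per URL.
    """
    url_to_label = {url: label for label, url in DEFINITION.findall(original)}
    result = converted
    for new_label, url in DEFINITION.findall(converted):
        old_label = url_to_label.get(url)
        if old_label is None or old_label == new_label:
            continue  # Unknown URL, or the label already matches.
        escaped = re.escape(new_label)
        # Usage: [Text][] becomes [Text][label].
        result = re.sub(
            r'\[' + escaped + r'\]\[\]',
            lambda m: '[' + new_label + '][' + old_label + ']',
            result,
        )
        # Definition: [Text]: url becomes [label]: url.
        result = re.sub(
            r'^\[' + escaped + r'\]:',
            lambda m: '[' + old_label + ']:',
            result,
            flags=re.MULTILINE,
        )
    return result


original = (
    'This is a test file, created by Tim at [Tigra Astronomy][tigra].\n\n'
    'This is another line.\n\n'
    '[tigra]: http://tigra-astronomy.com "Tigra home page"\n'
)
converted = (
    'This is a test file, created by Tim at [Tigra Astronomy][].\n\n'
    'This is another line.\n\n'
    '[Tigra Astronomy]: http://tigra-astronomy.com "Tigra home page"\n'
)
restored = restore_labels(original, converted)
print(restored)
```

On the sample from the question this gives back the original text, but it is only a sketch: if Pandoc wraps a link across lines, or the same URL appears under several labels, the matching will need more care.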

Upvotes: 1
