Reputation: 53
I have several HTML-formatted links in my bookdown .Rmd files that disappear in the generated PDF. The link appears to be ignored, and the PDF displays only the link text.
For example, <a href="https://www.cygwin.com" target="_blank">Cygwin</a> simply appears as Cygwin (no hyperlink).
But when the displayed text matches the URL, it works fine (e.g., <a href="https://www.cygwin.com" target="_blank">https://www.cygwin.com</a>), presumably because the text is the link itself.
Is there a way to have bookdown preserve these HTML hyperlinks in the PDF output?
I am running the following to generate the PDF in RStudio:
render_book("index.Rmd", "bookdown::pdf_book")
And the top of index.Rmd looks like this:
title: "My Title"
site: bookdown::bookdown_site
documentclass: book
link-citations: yes
output:
  bookdown::pdf_book:
    pandoc_args: [--wrap=none]
urlcolor: blue
Upvotes: 5
Views: 895
Reputation: 22544
Pandoc, and by extension R Markdown, just keeps the raw HTML of the links around. Raw HTML chunks are output for formats that support HTML (like EPUB), but not for LaTeX (which is used to generate the PDF). Pandoc still parses the link's content, which is why it seems to work when the link text is itself a URL.
The simplest solution would of course be to use Markdown syntax for links instead, which is just as expressive as HTML: [Cygwin](https://www.cygwin.com){target="_blank"}. However, if that is not an option, then things get a bit hacky.
Here's a method to still parse those links. It uses a Lua filter to convert the raw HTML into a proper link. Just save the following script as parse-html-links.lua in the same directory as your Rmd file and add '--lua-filter=parse-html-links.lua' to your list of pandoc_args.
-- inline elements seen between the opening and closing <a> tags
local elements_in_link = {}
local link_start
local link_end

Inline = function (el)
  if el.t == 'RawInline' and el.format:match'html.*' then
    -- opening tag: remember it and drop it from the output
    if el.text:match'<a ' then
      link_start = el.text
      return {}
    end
    -- closing tag: let pandoc parse the tag pair to build a Link element
    if el.text:match'</a' then
      link_end = el.text
      local link = pandoc.read(link_start .. link_end, 'html').blocks[1].content[1]
      link.content = elements_in_link
      -- reset
      elements_in_link, link_start, link_end = {}, nil, nil
      return link
    end
  end
  -- collect link content
  if link_start then
    table.insert(elements_in_link, el)
    return {}
  end
  -- keep original element
  return nil
end
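With the filter saved, the header from the question would need the extra pandoc argument. As a rough sketch, assuming the remaining options stay unchanged and that the indentation below matches the original index.Rmd (the quoting of the arguments is my choice, not something the question shows), it might look like this:

```yaml
title: "My Title"
site: bookdown::bookdown_site
documentclass: book
link-citations: yes
output:
  bookdown::pdf_book:
    # assumption: the existing --wrap=none argument is kept alongside the filter
    pandoc_args: ["--wrap=none", "--lua-filter=parse-html-links.lua"]
urlcolor: blue
```

The relative path works because the script sits in the same directory as the Rmd file; if it lives elsewhere, the path would need adjusting.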
Upvotes: 2