Reputation: 39
I am trying to convert an .htm file from the SEC website to a .pdf and have the internal links work. I am successfully converting to .pdf using wkhtmltopdf, but all the internal links point me back to the first page.
wkhtmltopdf https://www.sec.gov/Archives/edgar/data/1594617/000119312514117433/d640354ds1a.htm test.pdf
Upvotes: 1
Views: 1206
Reputation: 11157
It looks like there's an issue with wkhtmltopdf dealing with anchor tags that have no content. There's a PR that was opened in 2017 to resolve it, but it remains open.
As it turns out, your document does indeed have empty anchor tags, so that's probably the root cause:
<A NAME="toc640354_15"></A>
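If you want to check a page for this pattern before converting it, here is a minimal sketch using only Python's standard-library html.parser. The names EmptyAnchorFinder and find_empty_anchors are made up for illustration, and treating a whitespace-only anchor as empty is an assumption on my part, not something wkhtmltopdf documents.

```python
from html.parser import HTMLParser


class EmptyAnchorFinder(HTMLParser):
    """Collect the names of <a name=...> anchors that have no content.

    Hypothetical helper for illustration; not part of wkhtmltopdf.
    """

    def __init__(self):
        super().__init__()
        self.empty_names = []  # names of anchors that turned out to be empty
        self._pending = None   # name of the anchor currently open, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A NAME=...> matches.
        if tag == "a":
            attr_map = dict(attrs)
            self._pending = attr_map.get("name") or attr_map.get("id")
        else:
            # Any element nested inside the anchor counts as content.
            self._pending = None

    def handle_data(self, data):
        # Assumption: whitespace-only text still counts as "no content".
        if data.strip():
            self._pending = None

    def handle_endtag(self, tag):
        if tag == "a" and self._pending is not None:
            self.empty_names.append(self._pending)
        self._pending = None


def find_empty_anchors(html):
    """Return the name/id of every anchor target that has no content."""
    finder = EmptyAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.empty_names
```

Feeding it the downloaded HTML of the filing should list targets such as toc640354_15 if they are empty.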
I would suggest using Chrome to produce the PDF instead, with its --headless and --print-to-pdf flags. From your Chrome installation directory, run:
chrome.exe --headless --disable-gpu --print-to-pdf="C:\path\to\file.pdf" https://www.sec.gov/Archives/edgar/data/1594617/000119312514117433/d640354ds1a.htm
Make sure you specify an absolute path to the output file; with a relative path it doesn't seem to work, for whatever reason. The command returns immediately without any output or indication of success, so give it a few seconds to retrieve, render, and write the file.
I tested with your document, and the links work perfectly.
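If you need to script this, the same Chrome invocation can be wrapped in Python. This is only a sketch under a few assumptions: build_chrome_command and print_to_pdf are hypothetical names, the 60-second timeout is arbitrary, and polling for a non-empty output file is just one way to cope with the command returning before the PDF is written.

```python
import subprocess
import time
from pathlib import Path


def build_chrome_command(chrome, url, output):
    """Return the argument list for a headless print-to-PDF run."""
    # Resolve to an absolute path, since a relative one did not seem to work.
    output_path = Path(output).resolve()
    return [
        str(chrome),
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={output_path}",
        url,
    ]


def print_to_pdf(chrome, url, output, timeout=60.0):
    """Run Chrome, then wait until the PDF shows up (hypothetical helper)."""
    output_path = Path(output).resolve()
    # Remove a stale file so the check below cannot pass on old output.
    output_path.unlink(missing_ok=True)
    subprocess.run(build_chrome_command(chrome, url, output_path), check=True)

    # The command may return before the file is written, so poll for it.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if output_path.exists() and output_path.stat().st_size > 0:
            return output_path
        time.sleep(0.5)
    raise TimeoutError(f"no PDF appeared at {output_path} within {timeout} s")
```

For example, print_to_pdf("chrome.exe", "https://www.sec.gov/Archives/edgar/data/1594617/000119312514117433/d640354ds1a.htm", "test.pdf") would be the scripted equivalent of the command above, assuming chrome.exe can be found from the current directory.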
Upvotes: 2