Reputation: 105
I'm trying to split a PDF file into separate HTML files, one HTML file per PDF page. This is how I do it:
pdf2htmlEX --split-pages 1 LMS.pdf --page-filename lms%03.html
As a result I got an empty LMS.html and other files: lms%031.html, lms%032.html. The problem is that those HTML files are not correctly formatted: there is no CSS styling.
Upvotes: 1
Views: 1812
Reputation: 2354
Funny thing about that... I stumbled across your question while trying to solve an identical problem. I used the same command as yours, except without setting the --page-filename parameter. Using your example, my pdf2htmlEX call would be analogous to:
pdf2htmlEX --split-pages 1 LMS.pdf
Then I opened up the main HTML file in Chrome to find a bunch of blank pages. After searching around a bit, I opened up the same file in Firefox. It worked. Very strange. No errors were reported in the console output. Of course, I didn't even think to look in the Chrome developer console. When I did, I found:
Uncaught NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'file:///...'.
Thank God for StackOverflow. I don't know why it works in Firefox, but if you're getting the errors reported by Chrome, you need to be running a web server.
The easiest and fastest way for me to do this was to change into the directory in which I converted the PDF and run:
python -m SimpleHTTPServer
By default, your page will be served up at http://localhost:8000. Problem solved. Note that the command above is for Python 2; on Python 3 the equivalent is python3 -m http.server. Use whatever server suits you best.
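If you would rather start the server from a script than from the command line, here is a minimal sketch of the same idea for Python 3.7 or later. The function name serve_directory and its parameters are my own, not part of pdf2htmlEX or the Python standard library; it only wraps the standard http.server module.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=8000):
    """Serve `directory` over HTTP in a background thread and return the server.

    `serve_directory` is a hypothetical helper name used for illustration.
    Pass port=0 to let the operating system pick a free port.
    """
    # Bind the stock static-file handler to the chosen directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Listen on the loopback interface only, so the files stay local.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # The daemon thread stops automatically when the main program exits.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call serve_directory(".") from the directory that holds the converted files, keep the script alive (for example with input()), and open http://localhost:8000/LMS.html in Chrome. Call shutdown() on the returned server when you are done.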
Upvotes: 3