Tw1tCh

Reputation: 89

PDF to HTML or similar

I'm building an application to view PDFs in a browser on mobile devices without needing a plugin. I tried ImageMagick and Ghostscript to convert the pages to images, but the images are far too large and the text becomes unclear. I have seen websites that offer a PDF-to-HTML conversion service and do a decent job, but I can't find an example of how this is accomplished. Any help is much appreciated. Thanks!

Upvotes: 2

Views: 366

Answers (3)

FFL

Reputation: 669

I was googling and came across the link below, which explains how scribd.com implements its conversion: http://coding.scribd.com/2010/06/01/the-perils-of-stacking/

Upvotes: 1

FFL

Reputation: 669

If you want to convert PDF to HTML and plan to run the conversion on a server, you can try pdftohtml. It is a program packaged as part of poppler-utils. I do not know how the program accomplishes it.
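A minimal sketch of calling it from a server-side Python script, assuming poppler-utils is installed and pdftohtml is on the PATH. The function names and the output layout are made up for illustration. The `-c` (complex layout), `-s` (single document), `-f` and `-l` (first/last page) options are from pdftohtml's help output, but check `pdftohtml -h` on your version.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftohtml_command(pdf_path, out_dir, first_page=None, last_page=None):
    """Build the argument list for a pdftohtml run without executing it.

    The helper name and the out_dir/<pdf stem> output base are
    illustrative choices, not anything pdftohtml requires.
    """
    pdf_path = Path(pdf_path)
    out_base = Path(out_dir) / pdf_path.stem
    # -c: complex mode (tries to preserve layout), -s: one document for all pages
    cmd = ["pdftohtml", "-c", "-s"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    cmd += [str(pdf_path), str(out_base)]
    return cmd


def convert(pdf_path, out_dir):
    """Run pdftohtml; raises if the tool is missing or exits with an error."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found; install poppler-utils")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pdftohtml_command(pdf_path, out_dir), check=True)
```

Calling `convert("report.pdf", "out")` should leave the generated HTML (and any page images) under `out/`. The exact output file names depend on the options and the poppler version, so list the directory afterwards rather than hard-coding them.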

Upvotes: 1

CuddleBunny

Reputation: 1981

EDIT: I seem to have read the question backwards. In this case it might be best to parse through the PDF and then format some HTML based on what you find. I believe the javapdf option is capable of this, but I haven't used any of these so I am not sure. If worse comes to worst and you can't find software to disassemble a PDF, you might be able to write your own disassembler in Java or PHP by reading the PDF specification. Best of luck!

http://www.adobe.com/devnet/pdf/pdf_reference.html - PDF Specification (Adobe's modified version; since Adobe's tools are the most popular, you may want to support their extensions)

-- OLD -- These websites probably write their own proprietary software to do the trick. If you are truly interested in this undertaking, I would suggest parsing the HTML to get the data and style information and using it to drive some sort of PDF writer API. A quick Google search yields the following: -- END OLD --

http://www.cutepdf.com/Solutions/

http://ruby-pdf.rubyforge.org/pdf-writer/doc/index.html

http://asprise.com/product/javapdf/

Upvotes: 1
