Reputation: 9040
We need to import (or copy and paste) Word documents and convert them to HTML-ready data.
Here are my thoughts:
file_get_contents
nl2br
However, this approach does not preserve bold or other text formatting.
Also, Word inserts several Microsoft-specific characters that we don't want in the output.
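For the copy-and-paste case, the character cleanup could look roughly like the sketch below. It assumes the pasted text is UTF-8 and that the unwanted "Microsoft characters" are the usual typographic ones (curly quotes, dashes, ellipsis, non-breaking space). That list is an assumption, and `normalize_word_chars` and `text_to_html` are hypothetical helper names. It needs PHP 7 or later for the `\u{...}` escapes.

```php
<?php
// Hypothetical helper: replace common Word typographic characters
// (assumed UTF-8 input) with plain ASCII equivalents.
function normalize_word_chars(string $text): string
{
    $map = [
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // horizontal ellipsis
        "\u{00A0}" => ' ',   // non-breaking space
    ];
    // strtr with an array replaces every key in a single pass.
    return strtr($text, $map);
}

// Hypothetical helper: clean the text, escape it for HTML,
// then turn newlines into <br /> tags.
function text_to_html(string $text): string
{
    $clean = normalize_word_chars($text);
    return nl2br(htmlspecialchars($clean, ENT_QUOTES, 'UTF-8'));
}
```

This only tidies plain text. It does nothing for bold or other formatting, which is lost before the text ever reaches PHP.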
What is a good strategy for importing Word documents into clean HTML?
Upvotes: 0
Views: 1527
Reputation: 31300
I wouldn't try to tackle all of this on your own. word2cleanhtml.com looks like it will suit your needs and may have an API offering soon.
However, it appears that you can use Word itself from the command line to convert your document for you. This will, of course, require that MS Word is installed on your PHP server.
// The executable path contains a space, so it must be wrapped in double quotes.
shell_exec('"C:/Program Files/Microsoft Office/Office12/WINWORD.EXE" /msaveashtml C:/path/to/your.doc');
The above code uses the macro defined in this answer to a similar question. You will need to copy the saveashtml macro from that answer and add it to Word.
Upvotes: 3