Kais Dkhili

Reputation: 429

Convert HTML to Word document with python-docx?

I want to generate a Word document from an HTML field (a rich-text field where you can type text and set bold, italic, font color, size, ...). I am using python-docx to generate the Word document. Everything works (adding pictures, text, ...); the only problem is the styling: the content appears in the Word document, but without any formatting.

I tried saving the content as an HTML file and then opening that with python-docx, like this:

html_f=open('f_html.html','w') 
html_f.write(u''+contenu) 
html_f.close() 


doc2=docx.Document('f_html.docx')

But I don't get a result: Document() cannot find the file. Any help, please?

Upvotes: 2

Views: 11348

Answers (2)

Synthaze

Reputation: 6080

Alternatively:

from htmldocx import HtmlToDocx

new_parser = HtmlToDocx()
new_parser.parse_html_file("html_filename", "docx_filename")
# File extensions are not needed, but are tolerated
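
If the HTML is already in a string (like contenu in the question) and the document is being built with python-docx anyway, it may be simpler to append the HTML to that document instead of going through files. A rough sketch, assuming htmldocx's add_html_to_document method as described in its README; append_html is only an illustrative helper name:

```python
def append_html(document, html):
    """Append an HTML fragment to an existing python-docx Document.

    Assumes the third-party packages are installed:
        pip install python-docx htmldocx
    """
    # Imported lazily so the helper can be defined without htmldocx present.
    from htmldocx import HtmlToDocx

    HtmlToDocx().add_html_to_document(html, document)
    return document


# Usage (needs python-docx and htmldocx):
#   import docx
#   document = docx.Document()
#   document.add_paragraph("Text added with python-docx as usual")
#   append_html(document, "<p>Some <b>bold</b> and <i>italic</i> text</p>")
#   document.save("f_html.docx")
```

This keeps the pictures and text already added with python-docx in the same document as the converted HTML.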

Upvotes: 3

bkaf

Reputation: 486

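python-docx only accepts plain text, so it will not interpret HTML markup for you. On Windows, you can use the pywin32 extensions to let Word itself convert your HTML file (this requires Microsoft Word to be installed). A simple example I found:

import os
import win32com.client

word = win32com.client.Dispatch('Word.Application')
# Word resolves relative paths against its own working directory, so pass absolute paths.
doc = word.Documents.Add(os.path.abspath('example.html'))
# FileFormat=0 is wdFormatDocument (legacy .doc).
doc.SaveAs(os.path.abspath('example.doc'), FileFormat=0)
doc.Close()
word.Quit()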
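
If Word is not available, simple inline formatting can also be mapped by hand, because python-docx applies styling per run. The following is only a minimal sketch that handles bold and italic tags with the standard-library HTMLParser; RunCollector, html_to_runs and add_runs are illustrative names, and font color, size, lists and so on are not covered:

```python
from html.parser import HTMLParser


class RunCollector(HTMLParser):
    """Collect (text, bold, italic) tuples from simple inline HTML."""

    BOLD_TAGS = {"b", "strong"}
    ITALIC_TAGS = {"i", "em"}

    def __init__(self):
        super().__init__()
        self.runs = []
        self._bold = 0    # nesting depth of bold tags
        self._italic = 0  # nesting depth of italic tags

    def handle_starttag(self, tag, attrs):
        if tag in self.BOLD_TAGS:
            self._bold += 1
        elif tag in self.ITALIC_TAGS:
            self._italic += 1

    def handle_endtag(self, tag):
        if tag in self.BOLD_TAGS and self._bold:
            self._bold -= 1
        elif tag in self.ITALIC_TAGS and self._italic:
            self._italic -= 1

    def handle_data(self, data):
        if data:
            self.runs.append((data, self._bold > 0, self._italic > 0))


def html_to_runs(html):
    """Parse an HTML fragment into a list of (text, bold, italic) tuples."""
    parser = RunCollector()
    parser.feed(html)
    parser.close()  # flush any trailing text still buffered by the parser
    return parser.runs


def add_runs(paragraph, runs):
    """Write the collected runs into a python-docx paragraph."""
    for text, bold, italic in runs:
        run = paragraph.add_run(text)
        # Only switch formatting on; otherwise leave the style's default.
        if bold:
            run.bold = True
        if italic:
            run.italic = True
```

With python-docx installed, something like add_runs(document.add_paragraph(), html_to_runs(contenu)) would then write the styled text into a document created with docx.Document().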

Upvotes: 5
