George2

Reputation: 45801

Convert HTML into MS Word

I am looking for sample code that converts HTML into an MS Word document; C# code would be appreciated. The input is a string containing an HTML document, and I want to learn how to use the .NET Word (Office) SDK to do the conversion.

Thanks in advance, George

Upvotes: 2

Views: 2071

Answers (2)

Alan Plum

Reputation: 10902

Do remember that MSIE's rendering engine (or Word's, for that matter; AFAIK MS Office still uses its own) isn't as reliable as you might want it to be, so anything beyond simple formatting might look different in a Word document than in your browser.

Also, Word's interpretation of the DOC format might differ from your converter's; OO.o (OpenOffice.org) had that problem for ages.

Upvotes: 0

Bessi

Reputation: 762

It depends a lot on the nature of the HTML document you are trying to convert. One simple way is to use Word automation to open an .html file and then save it as a .doc document.

        object readOnly = false;
        object missing = System.Reflection.Missing.Value; // Values we don't care about
        object fileName = "C:/webpage.htm";               // Full path of the HTML input
        object newFileName = "C:/webpage.doc";            // Full path of the .doc output

        // Start a Word instance through COM automation
        Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();

        // word.Visible = true; // To see what's happening

        // Open the HTML file as a Word document
        Microsoft.Office.Interop.Word.Document document = word.Documents.Open(ref fileName, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);

        document.Activate();

        // Save it in the binary Word document format
        object saveFormat = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatDocument;

        document.SaveAs(ref newFileName, ref saveFormat, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);

        document.Close(ref missing, ref missing, ref missing);

        // Shut Word down so no WINWORD.EXE process is left running
        word.Quit(ref missing, ref missing, ref missing);

Note

  • You have to add a reference to Microsoft.Office.Interop.Word (or the equivalent Word interop assembly for your Office version).
  • The number of ref missing arguments depends on which version of Word you are using.
  • You have to use full paths for the file names, as the Word instance starts from the System folder.
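
Since the question starts from an HTML string rather than a file, one possible way to bridge the gap is to write the string to a temporary .htm file first and pass that file's full path to Word. The following is only a minimal sketch: the `HtmlTempFile` class and its `Write` method are hypothetical names, not part of any SDK, and writing UTF-8 with a byte-order mark is an assumption made to help Word detect the encoding.

```csharp
using System;
using System.IO;
using System.Text;

static class HtmlTempFile
{
    // Hypothetical helper: writes the HTML string to a uniquely named .htm
    // file in the user's temp folder and returns the full path of that file.
    public static string Write(string html)
    {
        // Path.GetTempPath() returns an absolute folder, so the combined path
        // is a full path, as Word requires.
        string path = Path.Combine(Path.GetTempPath(),
                                   Guid.NewGuid().ToString("N") + ".htm");

        // Assumption: UTF-8 with a byte-order mark, so the encoding of the
        // file can be detected when it is opened.
        File.WriteAllText(path, html, new UTF8Encoding(true));
        return path;
    }
}
```

The returned path can then take the place of the hard-coded input, for example `object fileName = HtmlTempFile.Write(html);`, and the temporary file can be deleted with `File.Delete` once the document has been saved and closed.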

Upvotes: 4
