Javier Marín

Reputation: 2096

Import .doc and .docx files in .NET and C#

I'm writing a text editor and I want to add the ability to import .doc and .docx files. I know I could use OLE Automation, but if I target a recent version of the Word library, it won't work for people with an older version of Word, and if I target an older version instead, it won't be able to read .docx files. Any ideas? Thanks

EDIT: Since my application works with HTML and RTF, another solution would be to convert .doc and .docx files to one of these formats from the command line, something like this: http://www.snee.com/bobdc.blog/2007/09/using-word-for-command-line-co.html

Upvotes: 3

Views: 5520

Answers (3)

Javier Marín

Reputation: 2096

It works with the Office 2003 PIA; tested on my computer running Office 2010:

using System;
using System.IO;
using System.Reflection;
using System.Text;
using Microsoft.Office.Interop.Word;

public string GetHtmlFromDoc(string path)
{
    var wordApp = new Application { Visible = false };
    string destPath = Path.Combine(Path.GetTempPath(), "word" + new Random().Next() + ".html");

    try
    {
        // Load the document
        object srcPath = path;
        var wordDoc = wordApp.Documents.Open(ref srcPath);

        if (wordDoc != null)
        {
            // Save it as HTML
            object oDestPath = destPath;
            object exportFormat = WdSaveFormat.wdFormatHTML;
            wordDoc.SaveAs(ref oDestPath, ref exportFormat);

            // Close the document
            wordDoc.Close();
        }
    }
    finally
    {
        // Always shut Word down, even if opening or saving failed
        wordApp.Quit();
    }

    // Check that the file exists
    if (File.Exists(destPath))
    {
        return File.ReadAllText(destPath, Encoding.Default);
    }
    return null;
}

Upvotes: 2

CesarGon

Reputation: 15325

Why don't you use the Office Primary Interop Assemblies (PIAs)?

I think you will have to decide which versions of Word you want to support. I suggest you settle on Word 2003 as the lowest. That will allow you to use the Office 2003 PIAs and program against them. Installing the PIAs on a machine installs binding redirects as well, so they work with newer versions of Word. There should be no problem opening .docx files with Word 2007 or 2010 through the Office 2003 PIAs, although I haven't tried this myself.

Upvotes: 1

eckleman

Reputation: 153

You should be able to use the Open XML libraries or XPath in .NET to read and import the contents of a .docx file.
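A minimal sketch of that idea, for illustration only. A .docx file is a ZIP package whose main text lives in word/document.xml, so this version uses ZipArchive (.NET 4.5 or later) and LINQ to XML rather than the Open XML SDK or XPath. The class name DocxText and the method Extract are hypothetical names chosen for the example. It is a simplification: it ignores headers, footers, footnotes and formatting, and text in nested paragraphs (for example inside text boxes) may appear twice.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class DocxText
{
    // WordprocessingML main namespace used by word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the plain text of a .docx stream, one line per paragraph.
    public static string Extract(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found");

            using (Stream xml = entry.Open())
            {
                XDocument doc = XDocument.Load(xml);

                // Each w:p is a paragraph; its text is split across w:t runs.
                var paragraphs = doc.Descendants(W + "p")
                    .Select(p => string.Concat(
                        p.Descendants(W + "t").Select(t => t.Value)));

                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }
}
```

Usage would look something like `using (var fs = File.OpenRead(path)) { string text = DocxText.Extract(fs); }`. This only covers .docx; the binary .doc format still needs Word or another converter.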

Upvotes: 0
