dps123

Reputation: 1073

Read Word document in C#

I want to read a Word document on the server (both .doc and .docx). The server does not have Office installed, so I can't use COM objects, and I can't use commercial software either.

Is there a way to read Word documents (2003 and 2007 formats) using office tools alone?

Upvotes: 2

Views: 4091

Answers (3)

Hazel Patton

Reputation: 61

Another free option, for .docx files only, is the Open XML SDK.
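
To give an idea of what that looks like, here is a minimal sketch that pulls the plain text out of a .docx file. It assumes the DocumentFormat.OpenXml NuGet package is referenced; the class and method names (DocxTextReader, ReadText) are made up for illustration. It only reads the main document body and ignores headers, footers, and footnotes.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the SDK.
public static class DocxTextReader
{
    // Returns the text of the main document body, one line per paragraph element.
    public static string ReadText(string path)
    {
        // The second argument (false) opens the package read-only.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            Body body = doc.MainDocumentPart?.Document?.Body;
            if (body == null)
            {
                return string.Empty;
            }

            // InnerText concatenates the text of all runs inside a paragraph.
            return string.Join(
                Environment.NewLine,
                body.Descendants<Paragraph>().Select(p => p.InnerText));
        }
    }
}
```

Calling DocxTextReader.ReadText("Sample.docx") should return the body text with formatting dropped. Paragraphs nested inside text boxes may show up more than once with this approach, so treat it as a starting point.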

For both .doc and .docx files you can use the free version of GemBox.Document if the files are relatively small; otherwise you'll need the Pro version.
You can open and read any Word format with it in the same way, for example:

var docxFile = DocumentModel.Load("Sample.docx");
var docFile = DocumentModel.Load("Sample.doc");
var rtfFile = DocumentModel.Load("Sample.rtf");

var docxText = docxFile.Content.ToString();
// ...

Upvotes: 0

MadBoy

Reputation: 11104

For .docx, your free option is DocX. It is very advanced and easy to use. For .doc, I haven't seen a free alternative.

Upvotes: 1

Samuel Neff

Reputation: 74939

Unfortunately, there are no good free options for reading both .doc and .docx files. Reasonably priced commercial options are scarce too, although there are good, extremely expensive ones.

For reading .doc files, the only free option I'm aware of is POI for Java, which you can run in .NET using IKVM. However, Word support is in an experimental branch of POI's SVN repository, so I don't know how well it works.

http://poi.apache.org/

http://www.ikvm.net/

If you just want the text out of the .doc file and don't care about formatting, you can use the IFilter Win32 interface through P/Invoke.

For reading .docx files you can use the Microsoft Office Open XML SDK. Don't let "SDK" fool you, though: it is a very light abstraction over dealing with the XML directly, and it's almost as painful to use.

http://www.microsoft.com/downloads/en/details.aspx?FamilyId=C6E744E5-36E9-45F5-8D8C-331DF206E0D0&displaylang=en
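
For comparison, here is a rough sketch of what "dealing with the XML directly" means, using only framework classes (System.IO.Compression and LINQ to XML, so .NET 4.5 or later with a reference to System.IO.Compression.FileSystem on .NET Framework). A .docx file is a ZIP package. The sketch assumes the main part lives at the conventional path word/document.xml instead of resolving it through the package relationships, and the names RawDocxReader and ReadText are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper that reads a .docx without any third-party library.
public static class RawDocxReader
{
    // Main WordprocessingML namespace used by elements such as w:p and w:t.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the body text, one line per w:p (paragraph) element.
    public static string ReadText(string path)
    {
        using (ZipArchive zip = ZipFile.OpenRead(path))
        {
            // Assumption: the main document part is at its conventional location.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found in package.");
            }

            using (Stream stream = entry.Open())
            {
                XDocument xml = XDocument.Load(stream);

                // Join the w:t text nodes of each paragraph into one string.
                var paragraphs = xml.Descendants(W + "p")
                    .Select(p => string.Concat(
                        p.Descendants(W + "t").Select(t => t.Value)));

                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }
}
```

This only extracts text runs; tabs, line breaks inside a paragraph, headers, footers, and footnotes are ignored. It does not work for .doc files, which are a binary format and not a ZIP package.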

Upvotes: 4
