Alex

Reputation: 35831

Code to read Word docs

I need a script (or other code, such as C#) that finds every paragraph or sentence containing a particular word in a set of Word 2007 documents and moves them to a new Word document, recording the filename of the source document each one was extracted from.

Upvotes: 1

Views: 1929

Answers (3)

Yahia

Reputation: 70369

Office Interop is an option, but beware: Microsoft does not support it in server-like scenarios (such as ASP.NET or a Windows Service); see http://support.microsoft.com/default.aspx?scid=kb;EN-US;q257757#kb2

Otherwise, you will need to use a library to achieve what you want.
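For simple cases the file format itself can help: a Word 2007 .docx file is a ZIP package, and the body text lives in word/document.xml as w:p (paragraph) elements containing w:t (text) elements. The following is only a minimal sketch in Python using the standard library, not a replacement for a full library. The function names read_paragraphs and find_paragraphs are made up for illustration, and the sketch ignores headers, footers, footnotes, tabs and line breaks.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def read_paragraphs(docx_path):
    """Return the plain text of each non-empty paragraph in the main body."""
    with zipfile.ZipFile(docx_path) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W + "p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        text = "".join(t.text or "" for t in p.iter(W + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs


def find_paragraphs(folder, word):
    """Yield (source filename, paragraph) for paragraphs containing `word`."""
    # Whole-word, case-insensitive match (an assumption about what is wanted).
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    for path in sorted(Path(folder).glob("*.docx")):
        if path.name.startswith("~$"):
            continue  # skip the lock files Word creates for open documents
        for paragraph in read_paragraphs(path):
            if pattern.search(paragraph):
                yield path.name, paragraph
```

Each hit carries the source filename, so the pairs can be written out in whatever form is needed, for example with Interop on a desktop machine or with a document library.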

Upvotes: 1

CodeLikeBeaker

Reputation: 21312

What about using a document indexer, such as dtSearch, to index your documents (Word, PDF, etc.) and then using its API to run your searches? From the sound of it, that might be the fastest way to accomplish this. Granted, indexers like dtSearch cost money (not a whole lot), but that may be worth it compared to the hours you would spend writing your own code to do the same thing.

If you don't want to use an indexer, here are some articles I found that might point you in the right direction:

http://omegacoder.com/?p=555

and

http://weblogs.asp.net/guystarbuck/archive/2008/05/13/automated-search-and-replace-in-multiple-word-2007-documents-with-c.aspx

Edit: To find a sentence that contains a specific word, you can try this link: http://msdn.microsoft.com/en-us/library/bb546163.aspx
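As a rough illustration of the sentence-level search (this is not the code from the linked page), here is a small Python sketch. It assumes a sentence ends at ".", "!" or "?" followed by whitespace, which will mis-split abbreviations such as "e.g.", and the function name sentences_containing is made up for this example.

```python
import re


def sentences_containing(text, word):
    """Return the sentences in `text` that contain `word` as a whole word."""
    # Naive split: break after ., ! or ? when whitespace follows.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Case-insensitive whole-word match, so "report" does not hit "Reporting".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


if __name__ == "__main__":
    sample = "The report is late. Send the Report today! Reporting is hard."
    for sentence in sentences_containing(sample, "report"):
        print(sentence)
```

Running it on the text of each paragraph extracted from a document gives sentence-level hits instead of whole paragraphs.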

Upvotes: 1

Bryce Fischer

Reputation: 5442

This might give you a start: http://msdn.microsoft.com/en-us/library/ff834910.aspx

Upvotes: 1
