Reputation: 1317
I want to read the contents of the following file types using C#:
Is there a common API in .NET for reading the contents of all these file types?
Upvotes: 3
Views: 1294
Reputation: 4216
I've used Aspose before. It's a very powerful product, but it's fairly pricey, so I would only recommend it if your application also needs to create new Word/PDF/RTF documents.
I agree with the other comments about just using System.IO for reading HTML files.
Upvotes: 1
Reputation: 171351
If you are going to full-text index the data, look into using Lucene; it can handle those file types.
Upvotes: 0
Reputation: 99684
There is no built-in support for reading most of those file types. HTML is plain text, so you can use a StreamReader from System.IO to read it, but you must parse it yourself.
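As a minimal sketch of the StreamReader approach for HTML, something like the following should work. The class name `HtmlFileReader`, the default path `page.html`, and the UTF-8 fallback encoding are assumptions made for illustration, not anything from the question. It only loads the raw markup; parsing is still up to you.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: reads an HTML file as plain text.
static class HtmlFileReader
{
    // Assumes UTF-8 unless another encoding is passed in; a byte order mark
    // in the file, if present, overrides this choice.
    public static string ReadAllText(string path, Encoding encoding = null)
    {
        using (var reader = new StreamReader(path, encoding ?? Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}

static class Program
{
    static void Main(string[] args)
    {
        // "page.html" is a placeholder path used only for this example.
        string path = args.Length > 0 ? args[0] : "page.html";
        string html = HtmlFileReader.ReadAllText(path);
        Console.WriteLine("Read {0} characters of markup.", html.Length);
    }
}
```

For small files, `File.ReadAllText(path)` does the same thing in one call.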
There are third-party components that will read these file types, but I am not sure whether there is one all-encompassing component.
For PDFs, I believe iTextSharp allows you to read them.
For RTF/Word, you can use the Primary Interop Assemblies.
Upvotes: 2