Reputation: 1989
I need to create a WinForms application in Visual Studio 2010 using C# that searches a directory for PDF files and then searches for certain text within those PDFs. For example, the user may enter "John Smith" into the form, and the program needs to search all the PDFs in a given directory for the text "John Smith". I currently do not have Adobe Acrobat and likely will not be able to purchase it or any non-free add-ins. I've been told to look at Apache Solr and Ghostscript, but I don't see how those could be used in a WinForms application. I've searched for this a lot and seen many suggestions, but I can't find any simple examples or tutorials on how to set up a form to search PDFs. Can someone provide some sample code showing how to search PDFs from a WinForms application?
Upvotes: 0
Views: 1055
Reputation: 11191
To search for certain text in a PDF, you can use the iTextSharp library at http://sourceforge.net/projects/itextsharp/
Here is a quick example:
// Requires: using System.IO; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
var reader = new PdfReader(pdfPath);
var output = new StringWriter();
for (int i = 1; i <= reader.NumberOfPages; i++)  // PDF page numbers start at 1
{
    output.WriteLine(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy()));
}
reader.Close();
// Now you can search the extracted text in output.ToString()
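To cover the directory part of the question, the extraction above could be wrapped in a small helper along these lines. This is only a sketch, assuming iTextSharp 5.x is referenced in the project; the class name PdfSearcher and the method names ExtractText and FindPdfsContaining are made up for illustration and are not part of iTextSharp.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class; the names here are illustrative only.
public static class PdfSearcher
{
    // Extracts the text of every page of a single PDF file.
    public static string ExtractText(string pdfPath)
    {
        var reader = new PdfReader(pdfPath);
        try
        {
            var text = new StringBuilder();
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(
                    reader, page, new SimpleTextExtractionStrategy()));
            }
            return text.ToString();
        }
        finally
        {
            // Release the file handle even if extraction throws.
            reader.Close();
        }
    }

    // Returns the paths of the PDFs directly inside 'directory'
    // whose extracted text contains 'term' (case-insensitive).
    public static List<string> FindPdfsContaining(string directory, string term)
    {
        var matches = new List<string>();
        foreach (string pdfPath in Directory.GetFiles(directory, "*.pdf"))
        {
            string text = ExtractText(pdfPath);
            if (text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                matches.Add(pdfPath);
            }
        }
        return matches;
    }
}
```

In the form, a button's Click handler could then call something like PdfSearcher.FindPdfsContaining(folderTextBox.Text, searchTextBox.Text) and add the returned paths to a ListBox (the control names are placeholders). Keep in mind that this only finds text that is actually stored in the PDF: scanned, image-only PDFs yield no text, and a phrase that is broken across lines in the document may not match as a single string.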
Upvotes: 3