Reputation: 238
I am using PDFSharp and need to check whether a document contains the text "abc". Examples:
11abcee = true
444abcggw = true
778ab = false
I wrote this code, but it does not work as expected:
PdfDocument document = PdfReader.Open("c:\\abc.pdf");
PdfDictionary dictionary = new PdfDictionary(document);
string a = dictionary.Elements.GetString("MTZ");
if (a.Equals("MTZ"))
{
    MessageBox.Show("OK", "");
}
else
{
    MessageBox.Show("NO", "");
}
Am I missing something?
Upvotes: 1
Views: 12129
Reputation: 2941
Old question, but here is an example.
Note: C# 7.0 or later is required for the "is" type pattern that declares a new local variable.
Note: This example uses PDFSharp installed from the NuGet Package Manager Console: "Install-Package PdfSharp -Version 1.50.5147"
Note: I only needed to search the first page of my PDFs. Loop over inputDocument.Pages if you need to search more pages.
using (PdfDocument inputDocument = PdfReader.Open(filePath, PdfDocumentOpenMode.Import))
{
    if (searchPDFPage(ContentReader.ReadContent(inputDocument.Pages[0]), searchText))
    {
        // match found.
    }
}
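This code looks for a CString that starts with a pound sign (#) and compares the text after its first two characters with searchText. To find "abc" anywhere in the text, the OP would need to use the string Contains function instead.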
private bool searchPDFPage(CObject cObject, string searchText)
{
    if (cObject is COperator cOperator)
    {
        // Only the text-showing operators Tj and TJ carry string operands we care about.
        if (cOperator.OpCode.Name == OpCodeName.Tj.ToString() ||
            cOperator.OpCode.Name == OpCodeName.TJ.ToString())
        {
            foreach (var cOperand in cOperator.Operands)
            {
                if (searchPDFPage(cOperand, searchText))
                {
                    return true;
                }
            }
        }
    }
    else if (cObject is CSequence cSequence)
    {
        // Recurse into nested sequences (the page content and TJ arrays).
        foreach (var element in cSequence)
        {
            if (searchPDFPage(element, searchText))
            {
                return true;
            }
        }
    }
    else if (cObject is CString cString)
    {
        // Specific to my PDFs: the value starts with "#" and the text to compare begins at index 2.
        if (cString.Value.StartsWith("#"))
        {
            if (cString.Value.Substring(2) == searchText)
            {
                return true;
            }
        }
    }
    return false;
}
Credit: This example was modified based on this answer: C# Extract text from PDF using PdfSharp
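For the question's case (finding "abc" anywhere in the text), a minimal sketch along the same lines might look like the following. The names PdfTextSearch, ContainsText, CollectText and PageTextContains are hypothetical, made up for illustration. It joins all Tj/TJ string operands of a page before searching, because a TJ array can split one word across several strings. It assumes CString.Value holds readable text, which is not the case for every font encoding, and joining without separators could in principle produce a match across two unrelated strings.

```csharp
using System;
using System.Text;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Content;
using PdfSharp.Pdf.Content.Objects;
using PdfSharp.Pdf.IO;

public static class PdfTextSearch
{
    // Hypothetical helper: true if any page's joined Tj/TJ strings contain searchText.
    public static bool ContainsText(string filePath, string searchText)
    {
        using (PdfDocument document = PdfReader.Open(filePath, PdfDocumentOpenMode.Import))
        {
            foreach (PdfPage page in document.Pages)
            {
                var pageText = new StringBuilder();
                CollectText(ContentReader.ReadContent(page), pageText);
                if (PageTextContains(pageText.ToString(), searchText))
                {
                    return true;
                }
            }
        }
        return false;
    }

    // Case-sensitive, culture-independent substring check.
    public static bool PageTextContains(string pageText, string searchText)
    {
        return pageText.IndexOf(searchText, StringComparison.Ordinal) >= 0;
    }

    // Appends every string operand of the Tj and TJ operators to 'text'.
    private static void CollectText(CObject cObject, StringBuilder text)
    {
        if (cObject is COperator cOperator)
        {
            if (cOperator.OpCode.Name == OpCodeName.Tj.ToString() ||
                cOperator.OpCode.Name == OpCodeName.TJ.ToString())
            {
                foreach (CObject operand in cOperator.Operands)
                {
                    CollectText(operand, text);
                }
            }
        }
        else if (cObject is CSequence cSequence)
        {
            // Covers the page content itself and the arrays passed to TJ.
            foreach (CObject element in cSequence)
            {
                CollectText(element, text);
            }
        }
        else if (cObject is CString cString)
        {
            text.Append(cString.Value);
        }
    }
}
```

Usage would be something like: if (PdfTextSearch.ContainsText("c:\\abc.pdf", "abc")) { /* match found */ }. With the question's examples, PageTextContains should accept "11abcee" and "444abcggw" and reject "778ab".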
Upvotes: 2
Reputation: 7830
Maybe this SO entry will help you: PDFSharp alter Text repositioning. It links to a text extraction example with PDFSharp.
Upvotes: 1