KarkMump

Reputation: 81

Finding text coordinates using Bytescout PDFExtractor in C#

I have a PDF in which I need to find and replace some text. I know how to create overlays and add text, but I can't work out how to locate the coordinates of the existing text. This is the example I found on the Bytescout site:

        // Create Bytescout.PDFExtractor.TextExtractor instance
        TextExtractor extractor = new TextExtractor();
        extractor.RegistrationName = "";
        extractor.RegistrationKey = "";

        /////find text
        // Load sample PDF document
        extractor.LoadDocumentFromFile(@"myPdf.pdf");

        int pageCount = extractor.GetPageCount();
        RectangleF location;

        for (int i = 0; i < pageCount; i++)
        {
            // Search each page for string
            if (extractor.Find(i, "OPTION 2", false, out location))
            {
                do
                {
                    Console.WriteLine("Found on page " + i + " at location " + location.ToString());

                }
                while (extractor.FindNext(out location));
            }
        }
        Console.WriteLine();
        Console.WriteLine("Press any key to continue...");
        Console.ReadLine();
    }
} 

But it doesn't compile, because there is no overload of Find that takes 4 arguments. I'm not married to using Bytescout to find text coordinates in a PDF, but my company has a license. Is there a license-free way to find text coordinates in a PDF if Bytescout can't do what I'm trying to do?

Upvotes: 0

Views: 399

Answers (1)

Aron Lawrence

Reputation: 153

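Try using the three-argument overload of Find and reading the coordinates from the extractor's FoundText property once a match is found:

if (extractor.Find(i, "OPTION 2", false))
    location = extractor.FoundText.Bounds;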

(source: https://cdn.bytescout.com/help/BytescoutPDFExtractorSDK/html/M_Bytescout_PDFExtractor_TextExtractor_Find.htm)

The FoundText property returns an ISearchResult: https://cdn.bytescout.com/help/BytescoutPDFExtractorSDK/html/T_Bytescout_PDFExtractor_ISearchResult.htm

which has these properties:

Bounds: Bounding rectangle of all search result elements. Use Elements or GetElement(Int32) to get bounds of individual elements.

ElementCount: Returns count of individual search result elements.

Elements: Search result elements (individual text objects included into the search result). For COM/ActiveX use GetElement(Int32) instead.

Height: Height of the bounding rectangle of search result. Use Elements or GetElement(Int32) to get bounds of individual elements.

Left: Left coordinate of the bounding rectangle of search result. Use Elements or GetElement(Int32) to get bounds of individual elements.

PageIndex: Index of the page containing the search result.

Text: Text representation of the search result. Use Elements or GetElement(Int32) to get individual elements.

Top: Top coordinate of the bounding rectangle of search result. Use Elements or GetElement(Int32) to get bounds of individual elements.

Width: Width of the bounding rectangle of search result. Use Elements or GetElement(Int32) to get bounds of individual elements.
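
For reference, here is how the loop from the question might look with that overload. This is only a sketch and I have not run it against your document. It assumes that Find(int, string, bool) and the parameterless FindNext() both return a bool, and that the current match is read from extractor.FoundText, so check those signatures against the SDK version you have installed. The file name and search string are the ones from the question.

```csharp
using System;
using System.Drawing;
using Bytescout.PDFExtractor;

class Program
{
    static void Main()
    {
        TextExtractor extractor = new TextExtractor();
        extractor.RegistrationName = "";
        extractor.RegistrationKey = "";

        // Load the PDF to search
        extractor.LoadDocumentFromFile(@"myPdf.pdf");

        int pageCount = extractor.GetPageCount();

        for (int i = 0; i < pageCount; i++)
        {
            // Assumption: Find returns true when the page contains a match
            if (extractor.Find(i, "OPTION 2", false))
            {
                do
                {
                    // Bounding rectangle of the current match
                    RectangleF location = extractor.FoundText.Bounds;
                    Console.WriteLine("Found on page " + i + " at location " + location);
                }
                // Assumption: FindNext() advances to the next match on the same page
                while (extractor.FindNext());
            }
        }
    }
}
```

The Left, Top, Width and Height properties listed above should give the same rectangle as Bounds, one value at a time, if that is easier to pass to your overlay code.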

Upvotes: 2
