I am new to Swift and iOS. I am working on a project where I need to extract text from a PDF. I know about the PDFKit
framework; however, I run into memory issues when I loop through the pages.
For that reason I found a library called PDFParser,
which solves most of my problem. But with complex PDFs it sometimes doesn't work well and gives me the wrong result.
I created a simple function that extracts the whole text of a page using the parser's output:
    extension SimpleDocumentIndexer {
        public func extractWholeTextFromPage(pageNumber: Int) -> String {
            guard let pageIndex = pageIndexes[pageNumber] else {
                return ""
            }
            var wholeText = ""
            for textBlock in pageIndex.textBlocks {
                wholeText.append(textBlock.chars)
            }
            return wholeText
        }
    }
What I want to achieve is:
I tried various approaches, including PDFKit's PDFPage.string and searching within the resulting text, but they run into memory issues. I also tried other parsing libraries.
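For context, here is a minimal sketch of the kind of page-by-page PDFKit loop I mean. The helper name `forEachPageText` is hypothetical (it is not part of PDFKit or PDFParser). Wrapping each iteration in `autoreleasepool` is a common way to release per-page temporaries early; I am assuming it may reduce peak memory, but I have not confirmed that it fully resolves the growth on large documents.

```swift
import Foundation
import PDFKit

// Hypothetical helper: visits the text of each page, one page at a time.
// Each iteration runs inside its own autorelease pool so that temporary
// objects created while reading a page can be released before the next one.
func forEachPageText(in url: URL, _ handle: (Int, String) -> Void) {
    // PDFDocument(url:) returns nil if the file cannot be opened as a PDF.
    guard let document = PDFDocument(url: url) else { return }

    for index in 0..<document.pageCount {
        autoreleasepool {
            guard let page = document.page(at: index) else { return }
            // PDFPage.string is optional; treat a missing string as empty text.
            handle(index, page.string ?? "")
        }
    }
}
```

The closure receives one page's text at a time, so the caller can search or process it without accumulating the whole document's text in a single string.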
P.S.: I don't want to use a paid library, because all I want is an offline solution that runs on the user's device. I also know about the PDFDocument.findString method, but I want the more specific approach I mentioned above.