Reputation: 1837
I'm currently using my scanner to turn my PDFs into searchable PDFs. The OCR is already taken care of, since I can use Ctrl-F within the PDF.
How can I get at the OCR'd content from my program, though?
I'm open to using Java or Ruby; the question is fairly language-agnostic. Is the OCR'd text openly accessible by reading the file?
Upvotes: 0
Views: 300
Reputation: 5707
I'm not sure how your OCR software creates the PDF, but could you use a third-party library such as JPedal or iText, or a tool such as Xpdf, to extract the text from the resulting PDF?
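As a rough illustration of the Xpdf route, here is a minimal Python sketch that shells out to Xpdf's pdftotext command-line tool and captures the text it prints. It assumes pdftotext is installed and on your PATH, and that your scanner stored the OCR result as a normal text layer in the PDF, which I can't confirm for your software. The function names and the file name scan.pdf are made up for the example.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, layout=False):
    """Build the argument list for Xpdf's pdftotext (hypothetical helper).

    Passing "-" as the output file makes pdftotext write to stdout.
    """
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if layout:
        # -layout asks pdftotext to keep the physical layout of the page
        cmd.append("-layout")
    cmd += [str(pdf_path), "-"]
    return cmd


def extract_text(pdf_path, layout=False):
    """Return the text layer of a PDF as a string (hypothetical helper).

    Assumes pdftotext is installed; raises RuntimeError if it is not found.
    """
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        build_pdftotext_command(pdf_path, layout),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # "scan.pdf" is a placeholder for one of your searchable PDFs
    print(extract_text("scan.pdf"))
```

If this prints nothing for a file where Ctrl-F works, the text is probably stored in some way pdftotext doesn't handle, and one of the libraries above may be worth trying instead. The same idea carries over to Java or Ruby by running the tool as a subprocess.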
Upvotes: 1