Reputation: 297
I have to convert a .pdf file containing scanned images into .txt files. Tesseract OCR only converts images to .txt, so I first need to extract the pages as .tif images and then convert them. Can anyone help me with this?
Upvotes: 13
Views: 19512
Reputation: 9402
Use ImageMagick:
convert -density 600 input.pdf output.tif
Density is in DPI; in my experience, 600 DPI works best.
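To get all the way to the .txt file, the two steps can be chained. The following is only a rough sketch: the function name pdf_to_txt and its arguments are made up for illustration, it assumes both convert and tesseract are on the PATH (ImageMagick also needs Ghostscript installed to read PDFs), and on ImageMagick 7 the command may be magick rather than convert.

```shell
# Hypothetical helper, not part of ImageMagick or Tesseract.
# Usage: pdf_to_txt INPUT.pdf OUTPUT_BASE [DPI]
pdf_to_txt() {
    p2t_pdf=$1
    p2t_base=$2
    p2t_dpi=${3:-600}   # defaults to the 600 DPI suggested above

    # Step 1: rasterize the PDF pages into a TIFF at the chosen density.
    convert -density "$p2t_dpi" "$p2t_pdf" "$p2t_base.tif" || return 1

    # Step 2: run OCR; tesseract appends .txt to the output base name.
    tesseract "$p2t_base.tif" "$p2t_base"
}
```

For example, pdf_to_txt input.pdf output should leave the recognized text in output.txt. A lower DPI runs faster but may reduce OCR accuracy, so it is worth trying a few values on a sample page.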
Upvotes: 22