Reputation: 21
Hi, I am trying to convert a non-readable PDF to JPEGs using the following code:
import cv2
import pytesseract
import re
import os
from wand.image import Image
from PIL import Image as PI
from pyocr import pyocr
from pyocr import builders
import io
from pyocr import tesseract as tool
req_image = []
final_text = []
os.chdir("E:\\NonReadablePath")
os.getcwd()
with Image(filename='E:\\NonReadablePath\\2563989.pdf') as img:
    print('pages = ', len(img.sequence))
    with img.convert('png') as converted:
        converted.save(filename='pyout/page.png')
I am getting this error: DelegateError: PDFDelegateFailed `The system cannot find the file specified. ' @ error/pdf.c/ReadPDFImage/800. It is raised on the line "with Image(filename='E:\\NonReadablePath\\2563989.pdf') as img:".
I am using Python 3.6 on Windows 10 with Anaconda 4.4.1. I have also installed ImageMagick and Ghostscript, and set the environment variable MAGICK_HOME for both of them.
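To help narrow this down, here is a minimal sketch of a check for whether a Ghostscript executable is visible on PATH from the same Python environment. The executable names gswin64c, gswin32c and gs are assumptions about how Ghostscript is usually installed, and find_ghostscript is only an illustrative helper, not part of wand or ImageMagick.

```python
import shutil

# Executable names Ghostscript is commonly installed under (assumed, not confirmed
# for this particular machine).
GS_CANDIDATES = ("gswin64c", "gswin32c", "gs")


def find_ghostscript(candidates=GS_CANDIDATES):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    # Prints a path if one of the candidates is on PATH, otherwise None.
    print("Ghostscript on PATH:", find_ghostscript())
```

If this prints None, the Ghostscript install folder is probably not on PATH for this environment, which may be related to the "cannot find the file specified" message.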
Any help would be appreciated.
Upvotes: 2
Views: 475
Reputation: 11
I'm new here, so forgive my formatting. I had the same problem and there didn't seem to be a good solution online. If you want to convert from PDF to JPG, I found a free online API called convertapi that is pretty easy to use. The only drawback is that you get a limited amount of free conversion time.
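Here's the code for convertapi:
import convertapi
# Name of the PDF file without the ".pdf" extension
filename = 'pdf_name_without_extension'
# Secret key from your convertapi account
convertapi.api_secret = 'your_secret_key'
# Convert the PDF to JPG and save the images in the "<filename>_images" folder
convertapi.convert('jpg', {'File': filename + '.pdf'},
                   from_format='pdf').save_files(filename + '_images')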
convertapi can be installed using pip, and the secret key is provided as soon as you create an account with convertapi. I hope this helps someone and saves them hours of debugging. Cheers
Upvotes: 1