RokiDGupta

Reputation: 381

Convert PDF to Image using Python

I am trying to convert a PDF file to an image file. For this, I have installed the following on my Ubuntu server:

  1. python2.7
  2. poppler-utils
  3. pdf2image==1.12.1

My code:

from pdf2image import convert_from_path, convert_from_bytes

images = convert_from_path("/home/user/pdf_file.pdf")

# OR

with open("/home/user/pdf_file.pdf", "rb") as pdf:
    images = convert_from_bytes(pdf.read())

OUTPUT

When I use the function "convert_from_path":

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

When I use the function "convert_from_bytes":

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 268, in convert_from_bytes
    paths_only=paths_only,
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

I reinstalled all my utilities, and I have been facing these problems ever since.

Upvotes: 5

Views: 12081

Answers (2)

gamesun

Reputation: 229

It failed for me on Python 2 too, but worked on Python 3.

The same issue has come up with another library: TypeError: 'threadsafe_iter' object is not an iterator

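As explained there, it is a Python 2 vs. 3 issue caused by the next() function: the built-in next() calls the iterator's next() method on Python 2, but its __next__() method on Python 3.
If you rename __next__() to next() in the file /home/***/.local/lib/python2.7/site-packages/pdf2image/generators.py, it runs successfully on Python 2.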
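
To illustrate the difference, here is a minimal sketch of a lock-protected generator wrapper that works on both Python 2 and Python 3. The names LockedIterator and count_up are made up for this example, and this is not the actual pdf2image code; it only shows the general pattern of defining __next__() and aliasing it to next().

```python
import threading


class LockedIterator(object):
    """Hypothetical thread-safe wrapper around a generator (illustration only)."""

    def __init__(self, gen):
        self.gen = gen
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3's built-in next() looks for __next__().
        with self.lock:
            return next(self.gen)

    # Python 2's built-in next() looks for next() instead, so alias it.
    next = __next__


def count_up(limit):
    # Simple generator used only to exercise the wrapper.
    for i in range(limit):
        yield i


it = LockedIterator(count_up(3))
first = next(it)   # works on both Python 2 and Python 3
rest = list(it)    # consumes the remaining items
```

Without the alias, calling next(it) on Python 2 raises the same kind of "object is not an iterator" TypeError shown in the question.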

BTW, I have opened a new issue with the pdf2image team:
TypeError: ThreadSafeGenerator object is not an iterator #133


Additional
The pdf2image README says it is a Python (3.5+) module.
pdf2image v1.7.1 works on Python 2.7. Try it with pip install pdf2image==1.7.1

Upvotes: 4

Mohit Chandel

Reputation: 1916

If you want to convert a PDF to an image, you can try the Python Ghostscript package:

pip install ghostscript

import ghostscript
import locale

def pdf2jpeg(pdf_input_path, jpeg_output_path):
    args = ["pdf2jpeg", # actual value doesn't matter
            "-dNOPAUSE",
            "-sDEVICE=jpeg",
            "-r144",
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]

    # arguments have to be bytes, so encode them
    encoding = locale.getpreferredencoding()
    args = [a.encode(encoding) for a in args]

    ghostscript.Ghostscript(*args)

pdf2jpeg(
    "...Fixate/ActiveState/pdf/a.pdf",
    "...Fixate/ActiveState/pdf/a.jpeg",
)

Upvotes: 5
