Jkind9

Reputation: 740

Converting .TIF to .PDF gives PIL: Error reading image

I've been trying to batch-convert some .TIF files to PDFs. I had it working, but after I tried to modify img2pdf so that it would accept larger files, I could not get the same program running again, even after reinstalling.

It currently throws the following error:

>>>>
ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x000001A608255EB8>

Here is the code I've been using. Anyone got any suggestions? Thanks in advance.


import os

import img2pdf

image_directory = r"PATH"

image_files = []

# Walk the directory tree and collect every TIFF, whatever the case of its extension.
for root, dirs, files in os.walk(image_directory):
    for file in files:
        if file.lower().endswith(".tif"):
            path = os.path.join(root, file)
            print("Discovered this TIF: ", path)
            image_files.append(path)

# Convert each TIFF to a PDF with the same base name.
for image in image_files:
    output_file = os.path.splitext(image)[0] + ".pdf"
    print("Converting", image, "to", output_file)
    pdf_bytes = img2pdf.convert(image)
    # The with block closes the file, so the PDF is fully written to disk.
    with open(output_file, "wb") as pdf_file:
        pdf_file.write(pdf_bytes)

Here is the full traceback:

Traceback (most recent call last):

  File "<ipython-input-37-fe96d5eeb049>", line 1, in <module>
    runfile('PATH', wdir='PATH')

  File "PATH", line 704, in runfile
    execfile(filename, namespace)

  File "PATH", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "PATH", line 23, in <module>
    pdf_bytes = img2pdf.convert(image_files)

  File "PATH", line 1829, in convert
    ) in read_images(rawdata, kwargs["colorspace"], kwargs["first_frame_only"]):

  File "PATH", line 1171, in read_images
    "PIL: error reading image: %s" % e

ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x000001A6082BE3B8>

Upvotes: 2

Views: 890

Answers (2)

Mark Setchell

Reputation: 207540

If, as I understand it, you want to recursively find all TIFF images and convert each one to a correspondingly named PDF file, you can do that simply and in parallel with GNU Parallel and ImageMagick like this in Terminal:

find . -iname "*tif" -print0 | parallel -0 --dry-run convert {} {.}.pdf

Sample Output

convert ./OpenCVTIFF64/result.tif ./OpenCVTIFF64/result.pdf
convert ./OpenCVTIFF64/a.tif ./OpenCVTIFF64/a.pdf
convert ./OpenCVBasics/a.tif ./OpenCVBasics/a.pdf
convert ./CImgDump/image.tif ./CImgDump/image.pdf

That command says... "Starting in the current directory, recursively find all files whose names end in tif, whether upper or lowercase or some mixture, and pass their names, null-terminated, to GNU Parallel. It should then read each name and run ImageMagick convert (magick in ImageMagick 7) to turn that TIFF into a file with the same name but the extension replaced with pdf." Note that mogrify is not the right tool in this form: it treats every filename it is given as an input to modify in place, so it would not write the second name as an output.

If it does what you want, remove the --dry-run and do it again for real.
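
For anyone who would rather stay in Python, the same dry run can be sketched with the standard library alone. This is only an illustration of the find-and-rename step, not part of the command above: the function name plan_conversions is made up for this example, and it only lists the planned input and output names without converting anything.

```python
from pathlib import Path


def plan_conversions(start="."):
    """Pair every file under `start` whose name ends in "tif" with a PDF path.

    Hypothetical helper for illustration; it performs no conversion.
    """
    pairs = []
    for path in sorted(Path(start).rglob("*")):
        # Lower-casing the name mimics find's case-insensitive -iname "*tif".
        if path.is_file() and path.name.lower().endswith("tif"):
            # with_suffix swaps the last extension, like {.}.pdf in GNU Parallel.
            pairs.append((path, path.with_suffix(".pdf")))
    return pairs


if __name__ == "__main__":
    # Dry run: print what would be converted, nothing is written.
    for tif, pdf in plan_conversions("."):
        print(tif, "->", pdf)
```

Once the printed pairs look right, each pair could be handed to whichever converter you prefer.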

Upvotes: 4

Jkind9

Reputation: 740

This ended up working once I ran pip install 'Pillow>=6.0.0' --force-reinstall, even though the command itself did not complete cleanly. I still get a few warnings when I run the script, but it now works. In short, the problem was with Pillow.
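
If you hit the same error, it may help to confirm which Pillow version Python actually sees before reinstalling. The snippet below is a rough sketch using only the standard library (importlib.metadata, so Python 3.8 or newer is assumed). The helper names parse_version and pillow_is_at_least are invented for this example, and the version parsing is deliberately simplistic: it reads only the leading numeric part of the version string.

```python
import re
from importlib import metadata


def parse_version(text):
    """Return the leading numeric part of a version string as a 3+ item tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    parts = [int(p) for p in match.group().split(".")] if match else []
    # Pad short versions such as "6" so they compare sensibly with "6.0.0".
    return tuple(parts + [0] * (3 - len(parts)))


def pillow_is_at_least(minimum="6.0.0"):
    """Report whether the installed Pillow meets the given minimum version."""
    try:
        installed = metadata.version("Pillow")
    except metadata.PackageNotFoundError:
        return False, "Pillow is not installed"
    return parse_version(installed) >= parse_version(minimum), "Pillow " + installed


if __name__ == "__main__":
    print(pillow_is_at_least())
```

Run it with the same interpreter that runs the conversion script, since a different environment may have a different Pillow installed.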

Upvotes: 0
