Reputation: 740
I've been trying to batch process some .TIF files and convert them to PDFs. I had it working, but after I tried to modify img2pdf so that it would accept larger files, I could not get the same program running again, even after reinstalling.
Currently it throws the following error:
ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x000001A608255EB8>
Here is the code I've been using. Anyone got any suggestions? Thanks in advance.
import os
import img2pdf

image_directory = r"PATH"
image_files = []
# Walk the directory tree and collect every .tif / .TIF file
for root, dirs, files in os.walk(image_directory):
    for file in files:
        if file.endswith(".tif") or file.endswith(".TIF"):
            print("Discovered this TIF: ", os.path.join(root, file))
            image_files.append(os.path.join(root, file))
# Convert each TIF to a PDF with the same base name
for image in image_files:
    output_file = image[:-4] + ".pdf"
    print("Putting all TIFs into ", output_file)
    pdf_bytes = img2pdf.convert(image)
    # Use a context manager so the output file is always closed
    with open(output_file, "wb") as f:
        f.write(pdf_bytes)
Here is the full traceback:
Traceback (most recent call last):
  File "<ipython-input-37-fe96d5eeb049>", line 1, in <module>
    runfile('PATH', wdir='PATH')
  File "PATH", line 704, in runfile
    execfile(filename, namespace)
  File "PATH", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "PATH", line 23, in <module>
    pdf_bytes = img2pdf.convert(image_files)
  File "PATH", line 1829, in convert
    ) in read_images(rawdata, kwargs["colorspace"], kwargs["first_frame_only"]):
  File "PATH", line 1171, in read_images
    "PIL: error reading image: %s" % e
ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x000001A6082BE3B8>
Upvotes: 2
Views: 890
Reputation: 207540
If, as I understand it, you want to recursively find all TIFF images and convert each one to a correspondingly named PDF file, you can do that simply and in parallel with GNU Parallel and ImageMagick like this in Terminal:
find . -iname "*tif" -print0 | parallel -0 --dry-run mogrify -format pdf {}
Sample Output
mogrify -format pdf ./OpenCVTIFF64/result.tif
mogrify -format pdf ./OpenCVTIFF64/a.tif
mogrify -format pdf ./OpenCVBasics/a.tif
mogrify -format pdf ./CImgDump/image.tif
That command says: "Starting in the current directory, recursively find all TIFF files, whether upper or lowercase or some mixture, and pass their names, null-terminated, to GNU Parallel. It should then read each name and run ImageMagick mogrify to convert that TIFF into a file with the same name but the extension replaced with PDF."
If it does what you want, remove the --dry-run and do it again for real.
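If you would rather stay in Python, roughly the same job can be sketched with the standard library. This is only a sketch under a few assumptions: the function names (find_tiffs, convert_all) are made up for illustration, it matches names ending in ".tif" in any letter case as the question's script does, and it assumes an ImageMagick mogrify binary is on your PATH, called with -format pdf so that each PDF is written next to its TIFF. It defaults to a dry run that only prints the commands.

```python
import os
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def find_tiffs(top):
    """Yield paths under `top` whose names end in '.tif', ignoring case."""
    for root, _dirs, files in os.walk(top):
        for name in sorted(files):
            if name.lower().endswith(".tif"):
                yield os.path.join(root, name)


def _run(argv):
    # check=True raises CalledProcessError if the command exits non-zero.
    return subprocess.run(argv, check=True)


def convert_all(top, dry_run=True, workers=4):
    """Build one mogrify command per TIFF; print them, or run them in parallel."""
    # Assumption: `mogrify -format pdf FILE` writes FILE's sibling with a .pdf extension.
    commands = [["mogrify", "-format", "pdf", path] for path in find_tiffs(top)]
    if dry_run:
        for argv in commands:
            print(shlex.join(argv))
        return commands
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so any failure is raised here.
        list(pool.map(_run, commands))
    return commands
```

Calling convert_all(".") prints the planned commands without touching any files; once the list looks right, convert_all(".", dry_run=False) runs them. Threads are enough here because the real work happens in the external processes.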
Upvotes: 4
Reputation: 740
This ended up working after I ran pip install 'Pillow>=6.0.0' --force-reinstall, even though the command itself did not complete cleanly. I still get a few warnings when I run the script, but it now works. In short, it was an issue with Pillow.
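For anyone hitting the same error, a quick way to see which Pillow version your interpreter actually picks up is sketched below. The helper names (parse_version, pillow_meets_minimum) are hypothetical, and the 6.0.0 threshold is simply the one from the pip command above, not a documented img2pdf requirement. It needs Python 3.8 or later for importlib.metadata.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '6.2.1' into (6, 2, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad to three components so '6' compares equal to '6.0.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def pillow_meets_minimum(minimum="6.0.0"):
    """Return (ok, installed); installed is None if Pillow is not installed."""
    try:
        installed = version("Pillow")
    except PackageNotFoundError:
        return False, None
    return parse_version(installed) >= parse_version(minimum), installed
```

Running print(pillow_meets_minimum()) in the same environment that runs the conversion script shows whether the reinstall took effect there, which matters if you have several Python installations.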
Upvotes: 0