Sylvain Page

Reputation: 633

Python Wand converts PDF to a black image

I'm trying to convert some pdf files to jpg through Wand in Python:

from wand.image import Image as Img
from wand.color import Color

    def importPdf(self):
        filename, _ = QtWidgets.QFileDialog.getOpenFileName(self, "Open File",
                                                            QtCore.QDir.currentPath())
        print(filename)
        if not filename:
            print('error')
            return
        with Img(filename=filename,format='jpeg', resolution=300) as image:
            image.compression_quality = 99
            image.save(filename='file.jpeg')
            self.open_picture()

My problem is that the result is a black image. The conversion works fine with PNG, but then I cannot run OCR (via tesseract) on the PNG. I think the cause is some kind of transparent layer, but I have not found a way to remove it, even though I tried several things such as

image.alpha_channel = False # made the same with True
image.background_color = Color('White')

before saving the file. I'm using ImageMagick 6.9, because version 7 fails with Wand.

Upvotes: 3

Views: 3302

Answers (2)

Martin

Reputation: 760

I had the same problem and fixed it; see my answer here: https://stackoverflow.com/a/46612049/2686243

Adding

image.background_color = Color("white")
image.alpha_channel = 'remove'

solved the issue.
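For context, here is a minimal sketch of how those two lines might fit into the conversion from the question. The function name `pdf_to_jpeg` and its parameters are made up for illustration, and it assumes a Wand version that accepts string values for `alpha_channel` and an ImageMagick build that supports the `remove` alpha option. For multi-page PDFs, each page may need the same treatment.

```python
def pdf_to_jpeg(pdf_path, jpeg_path, resolution=300):
    """Render a PDF with Wand and save it as a JPEG on a white background.

    Hypothetical helper, not part of Wand itself.
    """
    # Imported here so this module still loads where Wand/ImageMagick is absent.
    from wand.color import Color
    from wand.image import Image

    # The resolution is passed at read time so the PDF is rasterized at that DPI.
    with Image(filename=pdf_path, resolution=resolution) as image:
        # JPEG has no alpha channel, so composite transparency onto white first.
        image.background_color = Color("white")
        image.alpha_channel = "remove"
        image.format = "jpeg"
        image.compression_quality = 99
        image.save(filename=jpeg_path)
```

Calling `pdf_to_jpeg("input.pdf", "file.jpeg")` would then replace the `with Img(...)` block in the question.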

Upvotes: 5

Sylvain Page

Reputation: 633

Because I could not find an equivalent of -flatten in the Wand API, I ended up calling ImageMagick's convert.exe through os.system. It does the job.

import os

# Quote the input path so filenames containing spaces survive the shell.
cmd = 'convert -units PixelsPerInch -density 300 -background white -flatten "' + filename + '" converted_pdf.jpg'
print(cmd)
os.system(cmd)
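A variant of the same idea, sketched with `subprocess.run` and an argument list instead of a shell string, so that paths with spaces or quotes need no manual escaping. The helper names `build_convert_args` and `convert_pdf` are made up for illustration, and it assumes ImageMagick's `convert` is on the PATH. Unlike the command above, the input file is placed before `-background white -flatten`, which is the order ImageMagick documents for operators.

```python
import subprocess


def build_convert_args(pdf_path, jpeg_path, density=300):
    """Return an ImageMagick `convert` argument list (hypothetical helper)."""
    return [
        "convert",
        "-units", "PixelsPerInch",
        "-density", str(density),   # settings go before the input is read
        pdf_path,
        "-background", "white",
        "-flatten",                 # merge onto the white background
        jpeg_path,
    ]


def convert_pdf(pdf_path, jpeg_path="converted_pdf.jpg", density=300):
    """Run the conversion; raises CalledProcessError if convert fails."""
    # A list with no shell involved keeps unusual file names intact.
    subprocess.run(build_convert_args(pdf_path, jpeg_path, density), check=True)
```

With this, `convert_pdf(filename)` would stand in for the `os.system` call.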

Upvotes: 0
