scanny

Reputation: 28863

How can I determine type and size of common image types in Python?

I'm working on a set of libraries for processing Microsoft Office Open XML documents. In the course of embedding pictures in Word and PowerPoint documents, I need to determine the MIME type of the image and a few header details such as pixel dimensions; dpi would be nice too.

Currently I'm using Pillow to do this, but as dependencies go it's less than ideal. I only use a couple of statements from the library, but the dependency requires that folks have a C compiler and image libraries like libjpeg installed. This makes the install especially challenging on Windows, although even on OS X it's more involved than I'd like.

Is there a way that I can get just the basics with a pure Python imaging library or perhaps just merge a reasonably simple module with my distribution?

Upvotes: 1

Views: 690

Answers (1)

ThiefMaster

Reputation: 318468

First of all, using Pillow is probably the best solution, especially since you can download Windows binaries from PyPI.

A quick Google search turned up this pure-Python function to get the size of GIF, PNG and JPEG images:

import struct
from cStringIO import StringIO


def get_image_info(data):
    """
    Return (content_type, width, height) for a given img file content
    no requirements
    """
    data = str(data)
    size = len(data)
    height = -1
    width = -1
    content_type = ''

    # handle GIFs
    if (size >= 10) and data[:6] in ('GIF87a', 'GIF89a'):
        # Check to see if content_type is correct
        content_type = 'image/gif'
        w, h = struct.unpack("<HH", data[6:10])
        width = int(w)
        height = int(h)

    # See the PNG spec, second edition (http://www.w3.org/TR/PNG/)
    # Bytes 0-7 are the signature below, followed by a 4-byte chunk
    # length, then 'IHDR', and finally the 4-byte width and height
    elif ((size >= 24) and data.startswith('\211PNG\r\n\032\n')
          and (data[12:16] == 'IHDR')):
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[16:24])
        width = int(w)
        height = int(h)

    # Maybe this is for an older PNG version.
    elif (size >= 16) and data.startswith('\211PNG\r\n\032\n'):
        # Check to see if we have the right content type
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[8:16])
        width = int(w)
        height = int(h)

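    # handle JPEGs
    elif (size >= 2) and data.startswith('\377\330'):
        content_type = 'image/jpeg'
        jpeg = StringIO(data)
        jpeg.read(2)  # skip the SOI marker (FF D8)
        b = jpeg.read(1)
        w, h = -1, -1  # stay at -1 if no SOF segment is found
        try:
            while (b and ord(b) != 0xDA):
                # advance to the next marker: 0xFF prefix, then the marker code
                while (ord(b) != 0xFF): b = jpeg.read(1)
                while (ord(b) == 0xFF): b = jpeg.read(1)
                if (ord(b) >= 0xC0 and ord(b) <= 0xC3):
                    # SOF0-SOF3: skip segment length (2) and precision (1)
                    jpeg.read(3)
                    h, w = struct.unpack(">HH", jpeg.read(4))
                    break
                else:
                    # any other segment: skip over its payload
                    jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0])-2)
                b = jpeg.read(1)
            width = int(w)
            height = int(h)
        except struct.error:
            pass
        except ValueError:
            pass
        except TypeError:
            # ord('') at end of data means the file is truncated
            pass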

    return content_type, width, height
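
The function above targets Python 2 (`cStringIO`, byte strings as `str`). On Python 3, where file contents are `bytes`, the same header-sniffing idea could look roughly like the sketch below. This is my own adaptation, not code from the blog, and the names `sniff_image` and `_jpeg_size` are made up for illustration. Like the original, it assumes every JPEG segment before the scan data carries a length field and only recognises the SOF0-SOF3 frame headers.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def sniff_image(data):
    """Return (content_type, width, height); ('', -1, -1) if unrecognised."""
    if len(data) >= 10 and data[:6] in (b"GIF87a", b"GIF89a"):
        # Logical screen width and height: little-endian 16-bit values.
        width, height = struct.unpack("<HH", data[6:10])
        return "image/gif", width, height

    if (len(data) >= 24 and data.startswith(PNG_SIGNATURE)
            and data[12:16] == b"IHDR"):
        # IHDR data begins with big-endian 32-bit width and height.
        width, height = struct.unpack(">LL", data[16:24])
        return "image/png", width, height

    if data.startswith(b"\xff\xd8"):
        width, height = _jpeg_size(data)
        return "image/jpeg", width, height

    return "", -1, -1


def _jpeg_size(data):
    """Walk JPEG segments up to a SOF0-SOF3 header; (-1, -1) if none found."""
    pos = 2  # skip the SOI marker (FF D8)
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            pos += 1  # resynchronise on the next 0xFF marker prefix
            continue
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte before the real marker code
            continue
        if marker == 0xDA:
            break  # start of scan: no frame header was seen
        if 0xC0 <= marker <= 0xC3:
            if pos + 9 > len(data):
                break  # truncated frame header
            # Layout: marker (2), length (2), precision (1), height (2), width (2)
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        # Any other segment: the length field counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return -1, -1
```

You would call it with the raw file contents, e.g. `sniff_image(open(path, 'rb').read())`, where `path` is whatever image file you are embedding. Only the first few hundred bytes are usually needed for GIF and PNG, but the JPEG frame header can sit after large metadata segments, so reading the whole file is the simple option.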

Note that the code on that blog was written by Emmanuel VAÏSSE. There is no license specified on his blog, so depending on where you want to include the code, you might want to re-implement the function or ask him, to be on the safe side.
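
The function above does not report dpi, which the question also asks about. For PNG at least, that lives in the optional `pHYs` chunk, which stores pixels per unit for each axis plus a unit flag (1 means metres). A minimal Python 3 sketch of reading it, assuming a well-formed chunk sequence and not verifying CRCs; `png_dpi` is a name I made up, and JPEG and GIF are not covered here:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
METRES_PER_INCH = 0.0254


def png_dpi(data):
    """Return (horz_dpi, vert_dpi) from a PNG pHYs chunk, or None if absent."""
    if not data.startswith(PNG_SIGNATURE):
        return None
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">L4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"pHYs" and len(body) == 9:
            x_ppu, y_ppu, unit = struct.unpack(">LLB", body)
            if unit != 1:
                return None  # unit 0 only gives an aspect ratio, not a density
            return (round(x_ppu * METRES_PER_INCH),
                    round(y_ppu * METRES_PER_INCH))
        if chunk_type in (b"IDAT", b"IEND"):
            break  # pHYs, when present, comes before the image data
        pos += 8 + length + 4
    return None
```

A `None` result means the file does not state a physical density, so the caller has to pick its own default.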

Upvotes: 3
