Reputation: 83
I'm trying to write a Python function to parse the width and height from a JPEG file. The code I currently have looks like this:
import struct
image = open('images/image.jpg','rb')
image.seek(199)
#reverse hex to deal with endianness...
hex = image.read(2)[::-1]+image.read(2)[::-1]
print(struct.unpack('HH',hex))
image.close()
There are two problems with this. First, I need to scan the file to work out where to read from (the dimensions come right after the bytes ff c0 00 11 08). Second, I need to avoid picking up data from embedded thumbnails. Any suggestions?
Upvotes: 5
Views: 4745
Reputation: 31968
A further modernized, simplified, PEP 8-style version of the code from Acorn and manafire. It could be improved further, for example by reading in larger blocks for efficiency, but it is good enough for a quick example:
import sys
from struct import unpack

with open(sys.argv[1], 'rb') as jpegfile:
    if jpegfile.read(2) == b'\xff\xd8':
        byte = jpegfile.read(1)
        h = w = -1
        while byte != b'':
            # skip early segments (stop at end of file instead of looping forever)
            while byte and byte != b'\xff':
                byte = jpegfile.read(1)
            while byte == b'\xff':
                byte = jpegfile.read(1)
            if not byte:
                break
            # read dimensions
            if b'\xC0' <= byte <= b'\xC3':
                jpegfile.read(3)
                h, w = unpack('>HH', jpegfile.read(4))
                break
            else:
                size = unpack('>H', jpegfile.read(2))[0]
                jpegfile.read(size - 2)
            byte = jpegfile.read(1)
        print(f'Width: {w}, Height: {h}')
    else:
        print('Not a JPG!')
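One way to act on the efficiency remark above, as a rough sketch rather than a definitive implementation: seek past each segment's payload instead of reading it. The function name `jpeg_size` and the `SOF_MARKERS` set are my own choices for illustration. The set assumes you want every start-of-frame marker (0xC0 to 0xCF, minus the three non-frame markers in that range), not only 0xC0 to 0xC3 as in the code above.

```python
import io
import struct

# Start-of-frame markers carry the dimensions: 0xC0-0xCF except
# 0xC4 (DHT), 0xC8 (JPG extension) and 0xCC (DAC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_size(stream):
    """Return (width, height) from a seekable binary stream, or None."""
    if stream.read(2) != b'\xff\xd8':  # SOI marker
        return None
    while True:
        byte = stream.read(1)
        # Scan forward to the next 0xFF, then past any fill bytes.
        while byte and byte != b'\xff':
            byte = stream.read(1)
        while byte == b'\xff':
            byte = stream.read(1)
        if not byte:
            return None  # end of file without a frame header
        marker = byte[0]
        if marker in SOF_MARKERS:
            # length (2 bytes), precision (1), height (2), width (2)
            header = stream.read(7)
            if len(header) < 7:
                return None
            height, width = struct.unpack('>HH', header[3:7])
            return width, height
        if marker in (0xD9, 0xDA):
            return None  # EOI or start of scan came before any frame header
        if marker in (0x00, 0x01) or 0xD0 <= marker <= 0xD7:
            continue  # these markers have no length field
        raw = stream.read(2)
        if len(raw) < 2:
            return None
        (length,) = struct.unpack('>H', raw)
        if length < 2:
            return None  # malformed segment length
        stream.seek(length - 2, io.SEEK_CUR)  # skip payload without reading it
```

Pass it any seekable binary stream, such as a file opened with mode 'rb'. It returns None rather than printing when the data is not a JPEG or has no frame header.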
Upvotes: 0
Reputation: 6084
I couldn't get any of the solutions to work in Python 3 because of the changes to bytes and strings. Building on Acorn's solution, I came up with this, which works for me in Python 3:
import struct
import io

height = -1
width = -1

# read the whole file into memory, closing the file handle right away
with open('test.jpg', 'rb') as dafile:
    jpeg = io.BytesIO(dafile.read())
try:
    type_check = jpeg.read(2)
    if type_check != b'\xff\xd8':
        print("Not a JPG")
    else:
        byte = jpeg.read(1)
        while byte != b"":
            # find the next marker; stop at end of data instead of looping forever
            while byte and byte != b'\xff':
                byte = jpeg.read(1)
            while byte == b'\xff':
                byte = jpeg.read(1)
            if not byte:
                break
            if b'\xC0' <= byte <= b'\xC3':
                jpeg.read(3)
                h, w = struct.unpack('>HH', jpeg.read(4))
                width = int(w)
                height = int(h)
                break
            else:
                jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0]) - 2)
            byte = jpeg.read(1)
        print("Width: %s, Height: %s" % (width, height))
finally:
    jpeg.close()
Upvotes: 4
Reputation: 96081
My suggestion: use PIL (the Python Imaging Library), or its maintained fork Pillow.
>>> from PIL import Image
>>> img = Image.open("test.jpg")
>>> print(img.size)
(256, 256)
Otherwise, use Hachoir, which is a pure Python library; hachoir-metadata in particular seems to have the functionality you want.
Upvotes: 1
Reputation: 50597
The JPEG section of this function might be useful: http://code.google.com/p/bfg-pages/source/browse/trunk/pages/getimageinfo.py
jpeg.read(2)
b = jpeg.read(1)
try:
    while (b and ord(b) != 0xDA):
        while (ord(b) != 0xFF): b = jpeg.read(1)
        while (ord(b) == 0xFF): b = jpeg.read(1)
        if (ord(b) >= 0xC0 and ord(b) <= 0xC3):
            jpeg.read(3)
            h, w = struct.unpack(">HH", jpeg.read(4))
            break
        else:
            jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0])-2)
        b = jpeg.read(1)
    width = int(w)
    height = int(h)
except struct.error:
    pass
except ValueError:
    pass
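On the thumbnail concern from the question: embedded thumbnails live inside APPn segments (typically the Exif APP1 segment), so a parser that jumps over each segment by its length field never looks inside them. Here is a minimal sketch on synthetic data. The helper names `segment`, `sof0` and `frame_size` are made up for illustration, and the parser assumes segments sit back to back with no fill bytes, which real files do not guarantee.

```python
import struct


def segment(marker, payload):
    """Build one JPEG segment: 0xFF, marker byte, 2-byte length, payload."""
    return b'\xff' + bytes([marker]) + struct.pack('>H', len(payload) + 2) + payload


def sof0(width, height):
    """Build a baseline frame header (SOF0) with three components."""
    components = b'\x01\x22\x00\x02\x11\x01\x03\x11\x01'
    body = b'\x08' + struct.pack('>HH', height, width) + b'\x03' + components
    return segment(0xC0, body)


def frame_size(data):
    """Return (width, height) of the first top-level SOF0-SOF3 frame, or None."""
    if data[:2] != b'\xff\xd8':
        return None
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        (length,) = struct.unpack_from('>H', data, pos + 2)
        if 0xC0 <= marker <= 0xC3:
            if pos + 9 > len(data):
                return None  # truncated frame header
            # layout: FF, marker, length (2), precision (1), height (2), width (2)
            height, width = struct.unpack_from('>HH', data, pos + 5)
            return width, height
        pos += 2 + length  # jump over the whole segment, thumbnail included
    return None


# A 160x120 thumbnail wrapped in an APP1 segment, followed by the real 640x480 frame.
thumbnail = b'\xff\xd8' + sof0(160, 120) + b'\xff\xd9'
data = b'\xff\xd8' + segment(0xE1, b'Exif\x00\x00' + thumbnail) + sof0(640, 480) + b'\xff\xd9'

print(frame_size(data))       # (640, 480): the thumbnail's frame header is skipped
print(frame_size(thumbnail))  # (160, 120): the thumbnail parsed on its own
```

The same reasoning applies to the stream-based loops above: because they skip `length - 2` bytes for every non-frame segment, the thumbnail's own ff c0 header is never examined.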
Upvotes: 3