ensnare

Reputation: 42023

Extract embedded JPG from a RAW file w/ Python's MagickWand

I'd like to create thumbnails for RAW files, but processing the RAW file directly is very slow. I'd like to try to process the thumbnails from the embedded JPG first, and only render the RAW file as a last resort. How can I extract the embedded JPG using Wand?

Upvotes: 0

Views: 2566

Answers (2)

user1087001

Reputation:

If you are reading Canon CR2 files, I've written a library called rawphoto that you can use (it's also on PyPI).

To extract a thumbnail, you could do something like:

from rawphoto.raw import Raw
from wand.image import Image

with Raw(filename="example.CR2") as rawfile:
    blob = rawfile.fhandle.get_thumbnail()
# Do something with your JPEG thumbnail, e.g. convert it to a PNG with Wand:
with Image(blob=blob) as image:
    image.format = 'png'
    image.save(filename='someimage.png')

I may add support for other raw formats later. Pull requests welcome.

Upvotes: 0

emcconville

Reputation: 24419

The RAW format varies between devices and manufacturers. Usually the complete image sensor data is stored as TIFF with a JPEG preview. Rather than loading the complete file into a tool, it may be quicker to find the spec and extract the preview directly with plain file I/O and struct. Here's an example with Fujifilm's spec.

 from struct import unpack

 # Skip over header + directory
 # See the manufacturer's specification
 offset  = 16 # Magic bytes
 offset += 12 # Version
 offset += 32 # Camera name
 offset += 24 # Directory start & meta
 with open('source.raf', 'rb') as fd:
     fd.seek(offset, 0)
     # unpack() returns a tuple, so take the first element.
     # '>I' reads a big-endian unsigned 32-bit integer (RAF headers are big-endian).
     jpeg_offset = unpack('>I', fd.read(4))[0] # Read where JPEG data starts
     jpeg_length = unpack('>I', fd.read(4))[0] # Read size of JPEG data
     fd.seek(jpeg_offset, 0)
     jpg_blob = fd.read(jpeg_length)

Now drop the blob into Wand, or fall back to the RAW image:

 from wand.image import Image

 if jpg_blob:
     img = Image(blob=jpg_blob)
 else:
     img = Image(filename='source.raf')
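
For the fallback to trigger reliably, the extraction step should return nothing when the file is not what it expects. Here is a minimal sketch of that check, assuming the same 84-byte header layout as above, big-endian 32-bit fields, and the 16-byte signature `FUJIFILMCCD-RAW `. The function name `read_raf_preview` is made up for illustration; it is not part of Wand or any spec.

```python
import struct

RAF_MAGIC = b"FUJIFILMCCD-RAW "  # assumed 16-byte signature
JPEG_POINTER_OFFSET = 84         # 16 + 12 + 32 + 24, same sums as above
JPEG_SOI = b"\xff\xd8"           # a JPEG stream starts with these two bytes


def read_raf_preview(path):
    """Return the embedded JPEG bytes, or None if anything looks off."""
    with open(path, "rb") as fd:
        if fd.read(len(RAF_MAGIC)) != RAF_MAGIC:
            return None  # not the layout this sketch knows about
        fd.seek(JPEG_POINTER_OFFSET)
        pointer = fd.read(8)
        if len(pointer) != 8:
            return None  # file ends before the offset/length fields
        # Assumption: big-endian unsigned 32-bit offset, then length.
        jpeg_offset, jpeg_length = struct.unpack(">II", pointer)
        if jpeg_length == 0:
            return None
        fd.seek(jpeg_offset)
        blob = fd.read(jpeg_length)
    # Reject truncated data or anything that is not a JPEG stream.
    if len(blob) != jpeg_length or not blob.startswith(JPEG_SOI):
        return None
    return blob
```

A `None` result is falsy, so it slots into the `if`/`else` above: pass the bytes to `Image(blob=...)` when they exist, and open the RAW file by filename otherwise.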

This solution will work if all your RAW data has been generated with the same manufacturer's spec. Otherwise, you would need to build out file profiles for each spec, and evaluate the magic bytes of each RAW file to determine where the JPEG data is located. A flow may look something like...

dot file
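
The profile lookup described above could be sketched roughly like this. The table entries are illustrative assumptions, not a vetted list: the Fujifilm signature at offset 0, and `CR` at byte 8 for Canon CR2. `detect_profile` and `extract_preview` are hypothetical names. Each profile would need its own extractor written against that manufacturer's spec.

```python
HEADER_BYTES = 16  # enough to cover the signatures listed below

# Hypothetical profile table: (name, offset of signature, signature bytes).
RAW_PROFILES = [
    ("fuji_raf", 0, b"FUJIFILMCCD-RAW "),
    ("canon_cr2", 8, b"CR"),
]


def detect_profile(path):
    """Return the name of the first profile whose signature matches, else None."""
    with open(path, "rb") as fd:
        header = fd.read(HEADER_BYTES)
    for name, offset, signature in RAW_PROFILES:
        if header[offset:offset + len(signature)] == signature:
            return name
    return None


def extract_preview(path, extractors):
    """Run the extractor registered for the detected profile.

    `extractors` maps a profile name to a callable taking the path and
    returning JPEG bytes or None. Returns None when no profile matches
    or no extractor is registered, so the caller can fall back to
    rendering the RAW file itself.
    """
    extractor = extractors.get(detect_profile(path))
    if extractor is None:
        return None
    return extractor(path)
```

Keeping the signatures in a table means supporting another camera is a matter of adding one row and one extractor function, without touching the dispatch logic.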

Upvotes: 0
