Reputation: 855
I'd like to extract thumbnail image from jpegs, without any external library. I mean this is not too difficult, because I need to know where the thumbnail starts, and ends in the file, and simply cut it. I study many documentation ( ie.: http://www.media.mit.edu/pia/Research/deepview/exif.html ), and try to analyze jpegs, but not everything clear. I tried to track step by step the bytes, but in the deep I confused. Is there any good documentation, or readable source code to extract the info about thumbnail start and end position within a jpeg file?
Thank you!
Upvotes: 28
Views: 59853
Reputation: 16706
Exiftool is very capable of doing this quickly and easily:
exiftool -b -ThumbnailImage my_image.jpg > my_thumbnail.jpg
Upvotes: 36
Reputation: 8715
For most JPEG images created by phones or digital cameras, the thumbnail image (if present) is stored in the APP1 marker (FFE1). Inside this marker segment is a TIFF file containing the EXIF information for the main image and the optional thumbnail image stored as a JPEG compressed image. The TIFF file usually contains two "pages" where the first page is the EXIF info and the second page is the thumbnail stored in the "old" TIFF type 6 format. Type 6 format is when a JPEG file is just stored as-is inside of a TIFF wrapper. If you want the simplest possible code to extract the thumbnail as a JFIF, you will need to do the following steps:
Upvotes: 23
Reputation: 15752
There is a much simpler solution for this problem, but I don't know how reliable it is: Start reading the JPEG file from the third byte and search for FFD8 (start of JPEG image marker), then for FFD9 (end of JPEG image marker). Extract it and voila, that's your thumbnail.
A simple JavaScript implementation:
function getThumbnail(file, callback) {
if (file.type == "image/jpeg") {
var reader = new FileReader();
reader.onload = function (e) {
var array = new Uint8Array(e.target.result),
start, end;
for (var i = 2; i < array.length; i++) {
if (array[i] == 0xFF) {
if (!start) {
if (array[i + 1] == 0xD8) {
start = i;
}
} else {
if (array[i + 1] == 0xD9) {
end = i;
break;
}
}
}
}
if (start && end) {
callback(new Blob([array.subarray(start, end)], {type:"image/jpeg"}));
} else {
// TODO scale with canvas
}
}
reader.readAsArrayBuffer(file.slice(0, 50000));
} else if (file.type.indexOf("image/") === 0) {
// TODO scale with canvas
}
}
Upvotes: 8
Reputation: 3540
The wikipedia page on JFIF at http://en.wikipedia.org/wiki/JPEG_File_Interchange_Format gives a good description of the JPEG Header(the header contains the thumbnail as an uncompressed raster image). That should give you an idea of the layout and thus the code needed to extract the info.
Hexdump of an image header (little endian display):
sdk@AndroidDev:~$ head -c 48 stfu.jpg |hexdump
0000000 d8ff e0ff 1000 464a 4649 0100 0101 4800
0000010 4800 0000 e1ff 1600 7845 6669 0000 4d4d
0000020 2a00 0000 0800 0000 0000 0000 feff 1700
Image Magic (bytes 1,0), App0 Segment header Magic(bytes 3,2), Header Length (5,4) Header Type signature ("JFIF\0"||"JFXX\0")(bytes 6-10), Version (bytes 11,12) Density units (byte 13), X Density (bytes 15,14), Y Density (bytes 17,16), Thumbnail width (byte 19), Thumbnail height (byte 18), and finally rest up to "Header Length" is thumbnail data.
From the above example, you can see that the header length is 16 bytes (bytes 6,5) and version is 01.01 (bytes 12,13). Further, as Thumbnail Width and Thumbnail Height are both 0x00, the image doesn't contain a thumbnail.
Upvotes: -1