Kylar

Reputation: 9344

How to identify whether the contents of a byte[] are a JPEG?

I have a small byte array (under 25K) that I receive and decode as part of a larger message envelope. Sometimes it is an image, and specifically a JPG. I have no context information other than the byte array itself, and I need to determine both whether it IS an image and whether that image is a JPG.

Is there some magic number, or magic bytes that exist at the beginning, end or at some offset that I can look at to identify it?

An example of my code looks like this (from memory, not c/p):

byte[] messageBytesAfterDecode = retrieveBytesFromEnvelope();
if(null != messageBytesAfterDecode && messageBytesAfterDecode.length > 0){
    if(areTheseBytesAJpeg(messageBytesAfterDecode)){
        doSomethingWithAJpeg(messageBytesAfterDecode);
    }else{
        flagEnvelopeAsHavingBadContentInTheField();
    }
}

I really need what would go into the

boolean areTheseBytesAJpeg(byte[] mBytes){}

method, or even a pointer to a spec that details it. I'm hoping there is a very quick way to make this determination, since I don't really want to read them into an Image, etc.

Upvotes: 37

Views: 61150

Answers (6)

RATHI

Reputation: 5299

Some extra info about other file formats alongside JPEG. The start of the file contains these bytes:

BMP : 42 4D
JPG : FF D8 FF E0 (the first 2 bytes are always the same)
PNG : 89 50 4E 47
GIF : 47 49 46 38

The signature differs depending on whether a JPG file is raw or uses JFIF or EXIF:

Raw  : FF D8 FF DB  
JFIF : FF D8 FF E0  
EXIF : FF D8 FF E1

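Some code:

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

private static boolean isJPEG(File file) throws IOException {
    // try-with-resources closes the stream even if reading fails
    try (DataInputStream ins = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))) {
        // Mask off the fourth byte so Raw (DB), JFIF (E0) and EXIF (E1) all match FF D8 FF
        return (ins.readInt() & 0xFFFFFF00) == 0xFFD8FF00;
    } catch (EOFException e) {
        // Fewer than four bytes in the file, so it cannot be a JPEG
        return false;
    }
}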
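
Since the question starts from a byte[] rather than a File, the signature table above can also be applied directly to the array. The following is only a minimal sketch: the class name ImageSignatures and the method detect are made up for illustration, and a prefix match only says the data looks like that format, not that it decodes cleanly.

```java
final class ImageSignatures {
    // Leading bytes taken from the table in this answer
    private static final byte[] BMP  = {0x42, 0x4D};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] PNG  = {(byte) 0x89, 0x50, 0x4E, 0x47};
    private static final byte[] GIF  = {0x47, 0x49, 0x46, 0x38};

    /** Returns "JPEG", "PNG", "GIF", "BMP" or "UNKNOWN" based on the leading bytes only. */
    static String detect(byte[] data) {
        if (startsWith(data, JPEG)) return "JPEG";
        if (startsWith(data, PNG))  return "PNG";
        if (startsWith(data, GIF))  return "GIF";
        if (startsWith(data, BMP))  return "BMP";
        return "UNKNOWN";
    }

    // True when data is non-null, long enough, and begins with prefix
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this, the question's areTheseBytesAJpeg(mBytes) could be written as "JPEG".equals(ImageSignatures.detect(mBytes)), and any other non-UNKNOWN result would answer the "is it an image at all" part.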

Upvotes: 25

zsalzbank

Reputation: 9867

From Wikipedia:

JPEG image files begin with FF D8 and end with FF D9.

http://en.wikipedia.org/wiki/Magic_number_(programming)
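
Applied to the byte[] from the question, that rule could be sketched roughly as below. The wrapper class name JpegMarkers is hypothetical; the method name is the one the question asks for. One assumption to be aware of: this version insists on FF D9 as the very last two bytes, so data with extra bytes appended after the end marker would be rejected. If that is too strict for your envelopes, drop the end check and rely on the start marker alone.

```java
final class JpegMarkers {
    /**
     * Cheap structural check: the data must start with FF D8 and end with FF D9.
     * It does not decode the image.
     */
    static boolean areTheseBytesAJpeg(byte[] mBytes) {
        // Need at least the two-byte start marker plus the two-byte end marker
        if (mBytes == null || mBytes.length < 4) {
            return false;
        }
        int n = mBytes.length;
        // Java bytes are signed, so mask with 0xFF before comparing to hex values
        boolean startsWithMarker = (mBytes[0] & 0xFF) == 0xFF && (mBytes[1] & 0xFF) == 0xD8;
        boolean endsWithMarker = (mBytes[n - 2] & 0xFF) == 0xFF && (mBytes[n - 1] & 0xFF) == 0xD9;
        return startsWithMarker && endsWithMarker;
    }
}
```

This only looks at four bytes, so it stays cheap for the under-25K arrays described in the question.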

Upvotes: 63

Stephen C

Reputation: 719739

Another source of "knowledge" about magic numbers (including for JPEG files) is the magic file used by the GNU/Linux file command.

If you have the file command installed, then file --version will tell you where the magic file lives. You can read it in a text editor, with a careful reading of man 5 magic to make sense of the syntax.

(And the magic file contents confirm the details of other answers.)

Upvotes: 9

damg

Reputation: 689

A lot of formats are identified by so-called magic numbers. These are byte sequences, usually at the front of the file, that identify whether the binary data that follows is really what you think it is. A quick Google search returned: http://www.linfo.org/magic_number.html and specifically the citation:

"Similarly, a commonly used magic number for JPEG (Joint Photographic Experts Group) image files is 0x4A464946, which is the ASCII equivalent of JFIF (JPEG File Interchange Format). However, JPEG magic numbers are not the first bytes in the file; rather, they begin with the seventh byte. Additional examples include 0x4D546864 for MIDI (Musical Instrument Digital Interface) files and 0x425a6831415925 for bzip2 compressed files."

Upvotes: 4

user257111

Reputation:

Quoting this Wikipedia article:

JPEG image files begin with FF D8 and end with FF D9. JPEG/JFIF files contain the ASCII code for "JFIF" (4A 46 49 46) as a null terminated string. JPEG/Exif files contain the ASCII code for "Exif" (45 78 69 66) also as a null terminated string, followed by more metadata about the file.
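
To tell the two flavors apart in a byte[], one option is to look for the null-terminated identifier string. This is a hedged sketch, not a parser: the class name JpegFlavor and the returned labels are invented for illustration, and it assumes the JFIF or Exif segment comes immediately after the FF D8 marker, which puts the identifier at offset 6 (the seventh byte). A JPEG that stores that segment later in the stream would come back as OTHER_JPEG here.

```java
import java.nio.charset.StandardCharsets;

final class JpegFlavor {
    /** Returns "JFIF", "EXIF", "OTHER_JPEG" or "NOT_JPEG". */
    static String identify(byte[] data) {
        // Every JPEG starts with FF D8
        if (data == null || data.length < 2
                || (data[0] & 0xFF) != 0xFF || (data[1] & 0xFF) != 0xD8) {
            return "NOT_JPEG";
        }
        // Layout assumed: FF D8, 2-byte marker, 2-byte length, 4-byte identifier, 00
        if (data.length >= 11 && data[10] == 0) {
            String id = new String(data, 6, 4, StandardCharsets.US_ASCII);
            if (id.equals("JFIF")) {
                return "JFIF";
            }
            if (id.equals("Exif")) {
                return "EXIF";
            }
        }
        return "OTHER_JPEG";
    }
}
```

For the original question, anything other than NOT_JPEG would count as a JPEG; the JFIF/EXIF distinction is only useful if you care which metadata layout follows.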

Upvotes: 6

Jonathan Wood

Reputation: 67355

A JPG file does have a specific header that you can check to establish, with very good likelihood, that it is a JPG file. However, it's not clear whether you will have the entire file in the byte array.

Anyway, here are the specifics of the header: http://www.fastgraph.com/help/jpeg_header_format.html

Upvotes: 0
