anas

identify image file format

Given an image file with no extension, how can I read the file and identify its format in C++?

Upvotes: 1

Views: 3450

Answers (5)

NCP

Reputation: 59

I found a simple way to figure out the magic bytes of a particular file type:

  1. Collect several files of the same type, e.g. BMP.
  2. Read the first 128 bytes of each file and store them in a grid, one row per file.
  3. Find the byte positions (columns) where all files agree; those bytes and their positions are the likely "magic bytes".

Example:

    file1.bmp : 13 14 16 17 00 00 00 ...
    file2.bmp : 13 15 16 17 01 00 02 ...
    file3.bmp : 13 16 14 17 02 00 10 ...

    magicbytes: 13 -- -- 17 -- 00 -- ...

But there is no guarantee that other files of this type have these magic bytes, so I use a probability threshold: if a file matches at least 80% of my magic bytes, it is probably a file of this type.
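
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: `LearnedSignature`, `learnSignature` and `matchRatio` are hypothetical names, the samples are assumed to be already read into memory, and a position counts as "magic" only when every sample agrees on it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type: fixed[i] is true when every sample agreed on byte i,
// and value[i] then holds that shared byte.
struct LearnedSignature {
    std::vector<bool> fixed;
    std::vector<unsigned char> value;
};

// Compare the first `length` bytes of several files of the same type.
// A position missing from any sample (file too short) is never fixed.
LearnedSignature learnSignature(
    const std::vector<std::vector<unsigned char>>& samples, std::size_t length)
{
    LearnedSignature sig;
    sig.fixed.assign(length, !samples.empty());
    sig.value.assign(length, 0);
    for (std::size_t i = 0; i < length; ++i) {
        for (std::size_t s = 0; s < samples.size(); ++s) {
            if (i >= samples[s].size()) { sig.fixed[i] = false; break; }
            if (s == 0) {
                sig.value[i] = samples[s][i];
            } else if (samples[s][i] != sig.value[i]) {
                sig.fixed[i] = false;
                break;
            }
        }
    }
    return sig;
}

// Fraction of the fixed positions that `header` matches.
// Returns 0.0 when the signature has no fixed positions.
double matchRatio(const LearnedSignature& sig,
                  const std::vector<unsigned char>& header)
{
    std::size_t fixedCount = 0, hits = 0;
    for (std::size_t i = 0; i < sig.fixed.size(); ++i) {
        if (!sig.fixed[i]) continue;
        ++fixedCount;
        if (i < header.size() && header[i] == sig.value[i]) ++hits;
    }
    return fixedCount == 0 ? 0.0 : static_cast<double>(hits) / fixedCount;
}
```

A caller would then treat a file as a probable match when `matchRatio(sig, header) >= 0.8`, which corresponds to the 80% rule described above.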

This grid-data can be stored in a file or so and the filetype-analyser can "learn" to figure out on each analysi

Upvotes: 0

Sam Brightman

Reputation: 2950

Studying the source of the file command may be useful but most of the magic is done by... libmagic!

Upvotes: 1

KPexEA

Reputation: 16778

Reading the first few bytes gives you a good guess, but you would need to try fully parsing the file to be sure. Here is some code from one of my image-loading objects that you can use for reference:

    if(Open()==true)
    {
        unsigned char testread[5];

        // Read the first four bytes and compare them with known signatures.
        if(Read(&testread,(unsigned long)4)==4)
        {
            testread[4]=0; // NUL-terminate so the bytes can be compared as a string
            if(!strcmp((char *)testread,"GIF8")) // GIF87a / GIF89a
            {
                Close();
                LoadGIFImage(justsize);
            }
            else if(testread[0]==0xff && testread[1]==0xd8) // JPEG start-of-image marker
            {
                Close();
                LoadJPGImage(justsize);
            }
            else if(testread[0]==0x89 && testread[1]==0x50 && testread[2]==0x4e && testread[3]==0x47) // "\x89PNG"
            {
                Close();
                LoadPNGImage(justsize);
            }
            else if(testread[0]==0x00 && testread[1]==0x00 && testread[2]==0x01 && testread[3]==0x00) // Windows icon
            {
                Close();
                LoadWINICOImage(justsize);
            }
            else if(testread[0]==0x42 && testread[1]==0x4d) // "BM": Windows bitmap
            {
                Close();
                LoadBMPImage(justsize);
            }
        }
    }
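
For readers without the surrounding class, here is a minimal standalone sketch of the same signature checks using only the standard library. The names `detectImageFormat` and `detectImageFormatFromFile` are hypothetical and not part of the code above; the byte values are the ones tested above. As noted, a match is only a guess until the file is fully parsed.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper: classify a buffer holding the first bytes of a file.
// Returns a short format name, or "unknown" when no signature matches.
std::string detectImageFormat(const unsigned char* data, std::size_t size)
{
    struct Signature {
        const char* name;
        unsigned char bytes[4];
        std::size_t length;
    };
    // Same byte patterns as the checks in the snippet above.
    static const Signature signatures[] = {
        {"gif",  {'G', 'I', 'F', '8'},     4},
        {"jpeg", {0xFF, 0xD8},             2},
        {"png",  {0x89, 0x50, 0x4E, 0x47}, 4},
        {"ico",  {0x00, 0x00, 0x01, 0x00}, 4},
        {"bmp",  {0x42, 0x4D},             2},
    };
    for (const Signature& sig : signatures) {
        if (size >= sig.length && std::memcmp(data, sig.bytes, sig.length) == 0)
            return sig.name;
    }
    return "unknown";
}

// Hypothetical wrapper: read up to four bytes from a file and classify them.
std::string detectImageFormatFromFile(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return "unknown";
    unsigned char header[4] = {};
    in.read(reinterpret_cast<char*>(header), sizeof header);
    // gcount() reports how many bytes were actually read, even on a short file.
    return detectImageFormat(header, static_cast<std::size_t>(in.gcount()));
}
```

Keeping the signatures in a table makes it easy to add further formats without growing the `if`/`else` chain.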

Upvotes: 5

Andrejs Cainikovs

Reputation: 28434

Old school question ;) You need to check the so-called "magic numbers" in that file. In other words, almost every binary file type has some constant code at the start of the file. First, you need a hex viewer for that: www.hhdsoftware.com/Products/home/hex-editor-free.html

And then search here: www.garykessler.net/library/file_sigs.html
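
If installing a hex viewer is not convenient, the leading bytes can also be printed from C++ and then looked up in such a signature list. The sketch below is an assumption-laden illustration: `toHex` and `leadingBytesHex` are hypothetical helper names, not part of any library.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: format bytes as space-separated uppercase hex pairs.
std::string toHex(const unsigned char* data, std::size_t size)
{
    std::string out;
    char pair[3];
    for (std::size_t i = 0; i < size; ++i) {
        std::snprintf(pair, sizeof pair, "%02X", static_cast<unsigned>(data[i]));
        if (i != 0) out += ' ';
        out += pair;
    }
    return out;
}

// Hypothetical helper: hex string for the first `count` bytes of a file
// (fewer if the file is shorter, empty if the file cannot be opened).
std::string leadingBytesHex(const std::string& path, std::size_t count)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return std::string();
    std::string buffer(count, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(count));
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return toHex(reinterpret_cast<const unsigned char*>(buffer.data()),
                 buffer.size());
}
```

Calling `leadingBytesHex(path, 8)` on a file and comparing the result with the published signatures is usually enough to identify common image formats by eye.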

Upvotes: 1

Mehrdad Afshari

Reputation: 421988

You can check the source of the Linux file command (git://git.debian.net/git/debian/file.git). It does exactly the same thing, and not just for image files.

Upvotes: 6
