Linas

Reputation: 4408

Check binary image data

I need to determine what kind of file a user has uploaded by checking its binary data, and I found a good solution for that over here.

Just to be specific this is the function I'm using:

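function getImgType($filename) {
    // Open in binary mode so the bytes are read unmodified on every platform.
    $handle = @fopen($filename, 'rb');
    if (!$handle)
        throw new Exception('File Open Error');

    // Known signatures found at the very start of each format.
    $types = array('jpeg' => "\xFF\xD8\xFF", 'gif' => 'GIF', 'png' => "\x89\x50\x4e\x47\x0d\x0a", 'bmp' => 'BM', 'psd' => '8BPS', 'swf' => 'FWS');
    // fread() returns raw bytes; fgets() stops early when it meets a newline byte.
    $bytes = (string) fread($handle, 8);
    fclose($handle);
    $found = 'other';

    foreach ($types as $type => $header) {
        if (strpos($bytes, $header) === 0) {
            $found = $type;
            break;
        }
    }
    return $found;
}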

Now my question is: how can I get the identifying bytes for other file types, like .zip, .exe, .mp3, .mp4, etc.? If there is a list somewhere, that would be great, though I would also like to extract them myself and learn how all of this really works.

Upvotes: 2

Views: 4487

Answers (3)

GolezTrol

Reputation: 116120

Most files have a specific header or file signature or (apparently) magic number, which are different names for the same thing: a fixed set of bytes at the start of the file.

For instance, an .exe starts with 'MZ', and a .zip starts with a fixed 4-byte sequence (typically 'PK' followed by the bytes 03 04).
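As a rough sketch of how those two signatures could be checked in the same style as the question's function: the helper name `detectBySignature` and its signature table are hypothetical, and the zip entry assumes the common case of a non-empty archive, whose first bytes are `PK\x03\x04`.

```php
<?php
// Hypothetical helper; the name and the table below are illustrative only.
function detectBySignature(string $filename): string
{
    $signatures = [
        'exe' => "MZ",          // DOS/Windows executables
        'zip' => "PK\x03\x04",  // assumption: non-empty archive (first local file header)
    ];

    // Binary mode, so the bytes are read unmodified.
    $handle = @fopen($filename, 'rb');
    if ($handle === false) {
        throw new RuntimeException('File Open Error');
    }
    $bytes = (string) fread($handle, 8);
    fclose($handle);

    foreach ($signatures as $type => $header) {
        // Compare only as many leading bytes as the signature is long.
        if (strncmp($bytes, $header, strlen($header)) === 0) {
            return $type;
        }
    }
    return 'other';
}
```

Keep in mind that many formats are zip archives under a different extension (.docx and .jar, for example), so a 'zip' result only tells you about the container.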

This webpage contains a lot of file signatures: http://www.garykessler.net/library/file_sigs.html

If you search for .extension file format or .extension file header, you will usually find a description of the file format.

Upvotes: 2

Christian

Reputation: 28134

What you're looking for is called the file's magic number.

The magic number is one kind of file signature; sometimes it takes more than the magic number to identify the file.
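To illustrate why the leading bytes alone are sometimes not enough, here is a minimal sketch. The function name `detectContainer` is hypothetical. It relies on two layout facts: RIFF files (WAV, AVI, WebP) all start with `RIFF` and carry their actual form type at offset 8, and MP4-family files carry the marker `ftyp` at offset 4, after a size field.

```php
<?php
// Hypothetical helper: takes the first bytes of a file (12 are enough here).
function detectContainer(string $bytes): string
{
    // RIFF containers share the first four bytes; the form type sits at offset 8.
    if (substr($bytes, 0, 4) === 'RIFF') {
        $forms = ['WAVE' => 'wav', 'AVI ' => 'avi', 'WEBP' => 'webp'];
        $form = (string) substr($bytes, 8, 4);
        return $forms[$form] ?? 'riff (unknown form)';
    }

    // MP4 and its relatives (MOV, 3GP, ...) begin with a box size,
    // followed by the 'ftyp' marker at offset 4.
    if (substr($bytes, 4, 4) === 'ftyp') {
        return 'mp4 family';
    }

    return 'other';
}
```

You would feed it the result of something like `fread($handle, 12)`. Telling the members of the MP4 family apart would need the brand that follows `ftyp`, which this sketch does not inspect.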

A (very) short list of such numbers can be found here. A larger list can be found here.

File identification websites often mention the file's magic number as well.

On Linux, the file command can be used to identify files. In PHP, you can use the Fileinfo set of functions to identify files.


By the way, you did not specify the kind of files you want to identify. Sometimes, identification might be the wrong solution. For example, people used to want to identify files before passing them to GD or saving them on the server as images. In this case, identification is not really your job. Instead, use the following code:

$data = file_get_contents('data.dat'); // The file might even contain a JPG; it is
                                       // still loaded without problems,
$image = imagecreatefromstring($data); // since this function just needs the
                                       // file's data, nothing more.
// $image is false if the data is not a supported image format.

Upvotes: 4

Aziz

Reputation: 20735

What you're looking for is called "File Signatures", "Magic Bytes", or "Magic Numbers".

This page lists a lot of them for many file formats

However, I wouldn't rely on them for identifying file formats. Use PHP's finfo_file instead.
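A minimal sketch of that approach, assuming the fileinfo extension is enabled; the wrapper name `detectMimeType` is hypothetical, while `finfo_open`, `finfo_file`, and `FILEINFO_MIME_TYPE` are part of PHP's Fileinfo API.

```php
<?php
// Hypothetical wrapper around PHP's Fileinfo functions.
function detectMimeType(string $filename): string
{
    // FILEINFO_MIME_TYPE asks for a MIME type such as "image/png"
    // rather than a textual description.
    $finfo = finfo_open(FILEINFO_MIME_TYPE);
    if ($finfo === false) {
        throw new RuntimeException('Could not open the fileinfo database');
    }

    $mime = finfo_file($finfo, $filename);
    if ($mime === false) {
        throw new RuntimeException('Could not inspect the file');
    }
    return $mime;
}
```

For an upload you would pass the temporary path, for example `detectMimeType($_FILES['upload']['tmp_name'])`, where `'upload'` stands in for whatever your form field is called.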

Upvotes: 3
