Googlebot

Reputation: 15673

Detecting file MIME in C

I have files with wrong extensions and am trying to detect their correct MIME types in a C program.

For a PDF file with a .txt extension, libmagic (#include <magic.h>)

  const char *mime;
  magic_t magic;
  magic = magic_open(MAGIC_MIME_TYPE); 
  magic_load(magic, NULL);
  magic_compile(magic, NULL);
  mime = magic_file(magic, filename);
  printf("%s\n", mime);
  magic_close(magic);

returned

application/octet-stream

which is not very helpful.

GIO 2.0 (#include <gio/gio.h>)

  char *content_type = g_content_type_guess (file_name, NULL, 0, &is_certain);

  if (content_type != NULL)
    {
      char *mime_type = g_content_type_get_mime_type (content_type);

      g_print ("Content type for file '%s': %s (certain: %s)\n"
               "MIME type for content type: %s\n",
               file_name,
               content_type,
               is_certain ? "yes" : "no",
               mime_type);

      g_free (mime_type);
    }

returned

Content type for file 'test.txt': text/plain (certain: no)
MIME type for content type: text/plain

However, the file command on Linux identifies the file correctly:

file test.txt
test.txt: PDF document, version 1.6

This cannot be the expected behavior of such well-established C libraries. What am I doing wrong?

Upvotes: 1

Views: 747

Answers (1)

raliscenda

Reputation: 466

It is true that the file utility is built on top of libmagic, but what really determines the returned values is the flags passed to magic_open (or set later with magic_setflags) and the magic database that is used.

The library can work with either a pre-compiled database or a raw database (which has to be compiled by calling magic_compile); the latter is your case. According to the documentation, when NULL is passed as the file name, the default database files are /usr/local/share/misc/magic for the raw database (on Debian, /usr/share/misc/magic is a symlink to ../file/magic/, which is empty) and magic.mgc in the same parent directory for the compiled one.

The compiled database is written to the working directory by default, and on my Debian system it seems to be empty (which is consistent with the default raw database directory being empty). After realizing this, I tried your example with the magic_compile call removed, and that seems to improve things significantly.

Upvotes: 2
