chmanie

Reputation: 5086

Verify mime type of uploaded files in node.js

I'm using node and express to handle file uploads and I'm streaming them directly to conversion services using multiparty/busboy and request.

Is there a way to verify that the streams are of certain file types before sending them to the corresponding providers? I tried https://github.com/mscdex/mmmagic to detect the MIME type from the first chunk(s), and it worked nicely. I was wondering if the following workflow might work somehow:

I tried to get this working but I seem to have some stream compatibility issues (node 0.8.x vs. node 0.10.x streams, which are not supported by the request library).

Are there any best practices for solving this problem? Am I looking at it the wrong way?

EDIT: Thanks to Paul I came up with this code:

https://gist.github.com/chmanie/8520572

Upvotes: 4

Views: 12731

Answers (1)

Paul Mougel

Reputation: 17038

Besides checking the Content-Type header of the client's request, I'm not aware of a better or cleverer way to check MIME types than inspecting the data itself.
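As a rough sketch of the header check mentioned above: the function name isAllowedContentType and the allow-list are made up for illustration, and the header value is assumed to come from the request or from the Content-Type header of the individual multipart part. Since the client controls this header, it is only a cheap first filter and says nothing certain about the actual bytes.

```javascript
// Hypothetical helper: compares a client-declared Content-Type against an
// allow-list. The header is client-controlled, so this is only a first filter.
function isAllowedContentType(headerValue, allowedTypes) {
  if (typeof headerValue !== 'string') return false;
  // Drop parameters such as "; charset=utf-8" and normalise case.
  var mimeType = headerValue.split(';')[0].trim().toLowerCase();
  return allowedTypes.indexOf(mimeType) !== -1;
}
```

A part that fails this check can be rejected before any of its data is buffered; a part that passes should still go through the content-based check.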

You can implement the solution you described above using a Transform stream. In this example, the transform stream buffers some arbitrary amount of data, then sends it to your MIME checking library. If everything is fine, it re-emits data. The subsequent chunks will be emitted as-is.

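var stream = require('readable-stream');
var mmm = require('mmmagic');

var magic = new mmm.Magic(mmm.MAGIC_MIME_TYPE);
var mimeChecker = new stream.Transform();
mimeChecker.data = [];          // chunks held back until the MIME type is known
mimeChecker.mimeFound = false;

// Runs the MIME check on everything buffered so far, then releases it.
mimeChecker._checkMime = function (done) {
  var self = this;
  magic.detect(Buffer.concat(self.data), function (err, result) {
    // Passing the error to done() makes the stream emit 'error'.
    if (err) return done(err);
    if (result !== 'text/plain') return done(new Error('Wrong MIME'));
    self.data.forEach(function (chunk) { self.push(chunk); });
    self.data = [];
    self.mimeFound = true;
    done();
  });
};

mimeChecker._transform = function (chunk, encoding, done) {
  // Once the type has been verified, pass chunks through untouched.
  if (this.mimeFound) {
    this.push(chunk);
    return done();
  }

  // Buffer the first 10 chunks (an arbitrary amount), then check them.
  this.data.push(chunk);
  if (this.data.length < 10) return done();
  this._checkMime(done);
};

// If the input ends before 10 chunks arrive, check what was buffered.
mimeChecker._flush = function (done) {
  if (this.mimeFound || this.data.length === 0) return done();
  this._checkMime(done);
};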

You can then pipe this transform stream to any other stream, such as a request stream (which fully supports Node 0.10 streams, by the way).

// Usage example
var fs = require('fs');
fs.createReadStream('input.txt').pipe(mimeChecker).pipe(fs.createWriteStream('output.txt'));
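The checker above buffers a fixed number of chunks, but chunk sizes vary with the source. Here is a sketch of a variant that counts bytes instead and also checks inputs that end before the threshold is reached. It uses Node's core stream module rather than readable-stream. The names createMimeGate, detect, allowedTypes and minBytes are assumptions for illustration; detect is any function taking (buffer, callback) and calling back with (err, mimeType).

```javascript
var Transform = require('stream').Transform;

// Sketch: hold data back until `minBytes` are buffered (or the input ends),
// ask `detect(buffer, callback)` for a MIME type, then either forward
// everything or fail the stream. All parameter names are hypothetical.
function createMimeGate(detect, allowedTypes, minBytes) {
  var chunks = [];
  var bufferedBytes = 0;
  var checked = false;

  function check(gate, done) {
    var head = Buffer.concat(chunks);
    chunks = [];
    detect(head, function (err, mimeType) {
      if (err) return done(err);
      if (allowedTypes.indexOf(mimeType) === -1) {
        return done(new Error('Unexpected MIME type: ' + mimeType));
      }
      checked = true;
      gate.push(head); // release the data that was held back
      done();
    });
  }

  return new Transform({
    transform: function (chunk, encoding, done) {
      if (checked) return done(null, chunk); // already verified: pass through
      chunks.push(chunk);
      bufferedBytes += chunk.length;
      if (bufferedBytes < minBytes) return done(); // keep buffering
      check(this, done);
    },
    flush: function (done) {
      // Input ended before minBytes arrived: check whatever was buffered.
      if (checked || chunks.length === 0) return done();
      check(this, done);
    }
  });
}
```

With mmmagic, detect could presumably be magic.detect.bind(magic) for a Magic instance created with mmm.MAGIC_MIME_TYPE, since detect(buffer, callback) has the expected shape. An empty input passes through unchecked in this sketch, which may or may not be what you want.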

Edit: To clarify the incompatibility you encountered between Node 0.8 and 0.10 streams: when you attach a .on('data') listener to a stream, it switches into flowing mode (the 0.8 behaviour), which means it emits data even if the destination isn't ready to receive it. This is what can happen if you launch an asynchronous call to Magic.detect(): the data keeps flowing while you wait for the result, whether or not anything is listening for it.
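The mode switch can be observed directly in current Node versions. This is only an illustrative sketch: the readableFlowing property belongs to today's core stream module and did not exist in Node 0.10, and the function name is made up.

```javascript
var Readable = require('stream').Readable;

// Reports the stream's mode before and after a 'data' listener is attached.
function flowingStateAroundDataListener() {
  var source = new Readable({ read: function () {} });
  var before = source.readableFlowing; // null: no consumer attached yet
  source.on('data', function () {});   // attaching the listener starts the flow
  var after = source.readableFlowing;  // true: flowing mode
  source.pause();                      // an explicit pause stops the flow again
  return { before: before, after: after, paused: source.readableFlowing };
}
```

So a hand-rolled 'data' handler would need to pause the source while the asynchronous detection runs and resume it afterwards. A Transform stream avoids this bookkeeping, because the next chunk is not delivered until done() has been called.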

Upvotes: 9
