Israel Perales

Reputation: 2350

Why are accented letters not interpreted correctly in Express.js?

I have the following problem in Express: I receive a string in a POST request. A string containing the characters 'a e i o u' is interpreted correctly, but once a percent sign is added ('% á é í ó ú'), the accented letters become question marks ('�'). Any ideas?

This is my -package.json-: https://gist.github.com/ripper2hl/f05fd6de3b2b218e6d17

This is the -index.js- that receives the request: https://gist.github.com/ripper2hl/ae6533e14078bc9b0119

iojs v2.2.1

Upvotes: 1

Views: 2926

Answers (1)

robertklep

Reputation: 203304

In your sample project, you tell body-parser not to use the extended query string parser (the extended: false option).

This will make it use the built-in querystring module for parsing query strings, which is less robust than the one used as the extended parser (qs).
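If that is the cause, one possible fix is to switch body-parser to the extended parser. A minimal sketch, assuming an Express app set up roughly like the sample project; the /echo route and port 3000 are hypothetical and only there for illustration:

```javascript
var express    = require('express');
var bodyParser = require('body-parser');

var app = express();

// extended: true makes body-parser parse urlencoded bodies with qs
// instead of the built-in querystring module.
app.use(bodyParser.urlencoded({ extended: true }));

// Hypothetical route, only for illustration: echoes the parsed body.
app.post('/echo', function (req, res) {
  res.json(req.body);
});

app.listen(3000); // hypothetical port
```

This only makes the server more forgiving; a client that sends correctly encoded data works with either parser.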

You can see the differences between these two parsers with this PoC:

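// Node's built-in parser vs. the qs package (npm install qs)
var querystring = require('querystring');
var qs          = require('qs');
// Deliberately not URL-encoded: a bare '%', spaces, and non-ASCII letters
var input       = 'data=% á é í ó ú';

console.log('querystring:', querystring.parse(input) );
console.log('qs         :', qs.parse(input) );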

The output reproduces what you're seeing:

querystring: { data: '% � � � � �' }
qs         : { data: '% á é í ó ú' }

Ultimately, it boils down to your input, which is invalid in terms of URL-encoding:

  • % has special meaning (as an escape character), so a literal % must be encoded as %25
  • spaces should be encoded (as %20)
  • non-ASCII characters should be encoded (as the percent-encoded bytes of their UTF-8 form)
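The first point can be checked with JavaScript's built-in decodeURIComponent, which rejects a bare % outright instead of guessing. A small sketch; the helper name tryDecode is made up for illustration:

```javascript
// Hypothetical helper: returns the decoded string, or the error name
// when the input is not valid percent-encoding.
function tryDecode(s) {
  try {
    return decodeURIComponent(s);
  } catch (e) {
    return e.name;
  }
}

var raw     = tryDecode('% á');          // bare '%' is not followed by two hex digits
var encoded = tryDecode('%25%20%C3%A1'); // the same text, properly encoded

console.log(raw);
console.log(encoded);
```

The first call fails with a URIError, while the second decodes back to the original text.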

The valid input data looks like this:

data=%25%20%C3%A1%20%C3%A9%20%C3%AD%20%C3%B3%20%C3%BA
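On the client side, this string does not have to be written by hand. A minimal sketch using the standard encodeURIComponent function; the variable names are illustrative:

```javascript
var value = '% á é í ó ú';

// encodeURIComponent escapes '%', spaces, and non-ASCII characters
// (as UTF-8 bytes), producing a valid urlencoded value.
var body = 'data=' + encodeURIComponent(value);

console.log(body);
```

Only the value is passed through encodeURIComponent; the data= prefix is added afterwards so the = separator stays unescaped.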

Upvotes: 2
