Gonçalo Vieira

Reputation: 2249

MongoDB with non-standard UTF-8 characters

I have a problem when inserting accented characters into MongoDB.

The characters are, for example: é è í ì á à, and so on.

Basically, characters that are mostly used in Latin-language countries.

When I insert the data from a script, it works in the sense that the document is added to the collection, but the text is stored with some weird characters. (If I try the same insert in the console, it replies "non-utf-8 character".)

If I run a find for, say, "Olá", it matches the field that holds "Olá" (stored with those weird characters) and returns the data from the field I want just fine. However, if that field contains any character that isn't standard English, it is displayed like this:

�til?

Is there any way that I can handle this?

I'm using WebSphere Portal Server. The PortletView file outputs in UTF-8, the servlet encodes the input data as UTF-8 before sending it to the DB for the query, and it also converts the data it retrieves from the DB to UTF-8 (just to be sure).

Thanks in advance.

Upvotes: 2

Views: 7141

Answers (1)

Paul Grime

Reputation: 15104

Check that a servlet filter isn't causing the request to use the incorrect character encoding. This can be caused by calling one of the getParameterXXX() family of methods on the ServletRequest before setting the request's character encoding to UTF-8.

The servlet spec states that ISO-8859-1 is used by default when the request does not specify an encoding. See SRV.3.9 Request data encoding.
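To see what such a mismatch does to the text, here is a minimal sketch using only the JDK. It is not taken from the question's code: the class name `EncodingDemo` and its methods are hypothetical, and it simply assumes that bytes written in one charset get decoded with the other somewhere along the way.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: UTF-8 bytes decoded with the ISO-8859-1 default.
    // Each byte of a multi-byte character becomes its own (wrong) character.
    public static String utf8ReadAsLatin1(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: ISO-8859-1 bytes decoded as UTF-8.
    // A lone high byte is malformed UTF-8, so Java substitutes U+FFFD.
    public static String latin1ReadAsUtf8(String original) {
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Olá" written with an escape so the source file encoding does not matter.
        String ola = "Ol\u00e1";
        // 3 characters become 4: 'á' is two bytes in UTF-8.
        System.out.println(utf8ReadAsLatin1(ola).length()); // prints 4

        // "Útil": the 'Ú' byte is not valid UTF-8 on its own.
        String util = "\u00datil";
        // First character is replaced by U+FFFD, the rest survives.
        System.out.println(latin1ReadAsUtf8(util).equals("\uFFFDtil")); // prints true
    }
}
```

The second case produces the same kind of replacement character as the output shown in the question, which would be consistent with ISO-8859-1 bytes being read as UTF-8 at some step, though the sketch cannot tell you which step that is in your setup.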

Also make sure that the response uses the correct content type (with encoding). As posted in the comment above:

String contentType= "text/html;charset=UTF-8";
response.setContentType(contentType);

Upvotes: 2
