borghe

Reputation: 31

How does CouchDB handle UTF-8?

I'm quite puzzled by CouchDB: if I send a PUT request with some JSON string fields encoded as UTF-8, the non-ASCII characters get converted to "\uXXXX" escape sequences. Is there any way to tell it not to escape Unicode characters?

Upvotes: 3

Views: 2719

Answers (2)

senotrusov

Reputation: 809

CouchDB uses mochiweb to handle JSON encoding and decoding.

The encoding routine takes an option that tells it to output characters as UTF-8 instead of those \uXXXX escapes.

A simple way to apply the patch:

  1. Get the CouchDB source.
  2. Edit src/mochiweb/mochijson2.erl.
  3. Find -record(encoder, {handler=null, utf8=false}). around line 45.
  4. Change it to utf8=true.
  5. Run make clean; make; make install.
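
To make the effect of that flag concrete, here is a rough analogy in Python rather than Erlang. It is only a sketch: Python's json module and its ensure_ascii parameter are not part of CouchDB or mochijson2, and the sample document is made up. The assumption is that utf8=false behaves like escaping every non-ASCII character, and utf8=true like writing it out directly.

```python
import json

# Hypothetical document with one non-ASCII character.
doc = {"name": "café"}

# Default behaviour: non-ASCII characters are escaped (comparable to utf8=false).
escaped = json.dumps(doc)

# ensure_ascii=False: characters are written as-is (comparable to utf8=true).
raw = json.dumps(doc, ensure_ascii=False)

print(escaped)  # {"name": "caf\u00e9"}
print(raw)      # {"name": "café"}
```

Both strings are valid JSON for the same document; only the serialized form differs.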

I found a discussion with Chris Anderson (http://erlangine.feautec.pp.ru/?p=232), which suggests there is a chance of getting this behavior out of the box if someone cares to submit a proper patch to CouchDB.

Upvotes: 0

Pascal MARTIN

Reputation: 401002

Those \uXXXX sequences are a valid way of escaping Unicode characters in JavaScript (and JSON) strings.

Considering CouchDB is accessed using JSON (i.e. JavaScript data), those sequences should be interpreted when the data is used, so this should not be a problem.
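
As a quick check of that claim, here is a minimal sketch using Python's standard json module (chosen only for illustration; any conforming JSON parser should behave the same way). The two response bodies are made-up examples, not actual CouchDB output.

```python
import json

# Hypothetical response bodies: one with an escaped character, one with raw UTF-8.
escaped_body = '{"name": "caf\\u00e9"}'
raw_body = '{"name": "café"}'

# Parsing turns the \u00e9 escape back into the character itself,
# so both bodies decode to the same document.
assert json.loads(escaped_body) == json.loads(raw_body)

decoded = json.loads(escaped_body)["name"]
print(decoded)  # café
```

The escaping only affects the bytes on the wire, not the data the client ends up with.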

Upvotes: 5
