rado

Reputation: 105

How to convert Unicode escapes in a JSON request value into plain characters?

Sometimes a client sends a JSON-RPC request whose JSON values contain Unicode escape sequences. Example:

{ "jsonrpc": "2.0", "method": "add", "params": { "fields": [ { "id": 1, "val": "\u0414\u0435\u043d\u0438\u0441" }, { "id": 2, "val": "\u041c\u043e\u044f" } ] }, "id": "564b0f7d-868a-4ff0-9703-17e4f768699d" }

How I process the JSON-RPC request:

  1. My server receives the request as a byte[];
  2. Converts it to io.vertx.core.json.JsonObject;
  3. Performs some manipulations;
  4. Saves it to the DB.

Then I find records in the DB like:

"val": "\u0414\u0435\u043d\u0438\u0441"

And the worst part: if the client searches for this data, they get:

"val": "\\u0414\\u0435\\u043d\\u0438\\u0441"

So I think I need to convert the request data before deserializing it to JsonObject. I tried the following, but it didn't help:

String json = new String(incomingJsonBytes, StandardCharsets.UTF_8);
return json.getBytes(StandardCharsets.UTF_8);

I also tried StandardCharsets.US_ASCII.

Note: I can't use StringEscapeUtils.unescapeJava(), because it unescapes every '\' sequence, both the ones I need and the ones I don't.

Does anyone know how to solve this, or know a library that already does it? Thanks a lot.

Upvotes: 2

Views: 1279

Answers (1)

Karol Dowbecki

Reputation: 44952

io.vertx.core.json.JsonObject relies on Jackson's ObjectMapper to perform the actual JSON deserialization (e.g. io.vertx.core.json.Json has an ObjectMapper field). By default Jackson will convert \u0414\u0435\u043d\u0438\u0441 into Денис. You can verify this with a simple code snippet:

String json = "{ \"jsonrpc\": \"2.0\", \"method\": \"add\", \"params\": { \"fields\": [ { \"id\": 1, \"val\": \"\\u0414\\u0435\\u043d\\u0438\\u0441\" }, { \"id\": 2, \"val\": \"\\u041c\\u043e\\u044f\" } ] }, \"id\": \"564b0f7d-868a-4ff0-9703-17e4f768699d\" }";
ObjectMapper mapper = new ObjectMapper();
Map map = mapper.readValue(json, Map.class);
System.out.println(map); // {jsonrpc=2.0, method=add, params={fields=[{id=1, val=Денис}, {id=2, val=Моя}]}, id=564b0f7d-868a-4ff0-9703-17e4f768699d}

Most likely the client is sending something else, because your example value is deserialized correctly. Perhaps it's a doubly escaped value, \\u0414\\u0435\\u043d\\u0438\\u0441, which Jackson will convert to \u0414\u0435\u043d\u0438\u0441 by removing one layer of escaping?
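One way to test the double-escaping guess is to inspect the raw request bytes before any JSON parsing happens. The sketch below is only an illustration: the class name EscapeCheck and the method looksDoublyEscaped are hypothetical, and it assumes the payload is UTF-8 text. It reports whether the raw text contains two backslashes followed by u and four hex digits.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Vert.x or Jackson.
final class EscapeCheck {
    // Regex for two literal backslashes, then 'u' and four hex digits.
    private static final Pattern DOUBLE_ESCAPED =
            Pattern.compile("\\\\\\\\u[0-9a-fA-F]{4}");

    // Returns true if the raw payload appears to carry a doubly escaped sequence.
    static boolean looksDoublyEscaped(byte[] rawRequest) {
        // Assumption: the client sends UTF-8 encoded JSON text.
        String text = new String(rawRequest, StandardCharsets.UTF_8);
        return DOUBLE_ESCAPED.matcher(text).find();
    }

    public static void main(String[] args) {
        // JSON text with a normal, single-escaped value.
        byte[] single = "{\"val\":\"\\u0414\"}".getBytes(StandardCharsets.UTF_8);
        // JSON text where the backslash itself was escaped by the sender.
        byte[] doubled = "{\"val\":\"\\\\u0414\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(looksDoublyEscaped(single));
        System.out.println(looksDoublyEscaped(doubled));
    }
}
```

If this returns true for real traffic, the escaping is already wrong when the request arrives, which points at the client rather than at JsonObject.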

There is no magic solution for this. Either write your own Jackson deserialization configuration or make the client stop sending garbage.
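If fixing the client is not an option, a narrower alternative to StringEscapeUtils.unescapeJava() is to decode only the four-hex-digit Unicode escapes left in a string value after parsing and leave every other backslash untouched. This is a minimal sketch under that assumption, not a Jackson configuration; UnicodeEscapes and decode are hypothetical names. It does not handle a backslash that was itself meant to be literal in front of a u, so treat it as a starting point.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Vert.x, Jackson or Commons Text.
final class UnicodeEscapes {
    // Regex for one literal backslash, then 'u' and exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each matched escape with the UTF-16 code unit it names;
    // all other text, including other backslash sequences, is copied as-is.
    static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // A value as it might come out of the DB: literal escapes plus a
        // backslash that must survive unchanged.
        String stored = "\\u0414\\u0435\\u043d\\u0438\\u0441 C:\\temp";
        System.out.println(decode(stored));
    }
}
```

Applying decode to each string value after deserialization (for example while walking the parsed object) keeps the change local and avoids touching escapes such as \n or \\ that the data may legitimately contain.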

Upvotes: 3
