Reputation: 11669
What is the difference between the two strings below? When I decode the first string, it works fine and the diacritic characters show up correctly.
String val = "m%C3%B6torhead album";
String decodedVal = URLDecoder.decode(val, StandardCharsets.UTF_8);
But when I decode the string below, the diacritic characters don't show up correctly.
String val = "m%EF%BF%BDtorhead album";
String decodedVal = URLDecoder.decode(val, StandardCharsets.UTF_8);
Can anyone tell me what's wrong here? We get these strings from an upstream system, so we have no control over them.
Upvotes: 0
Views: 225
Reputation: 43728
The second sequence decodes to U+FFFD REPLACEMENT CHARACTER, which is used to replace an incoming character whose value is unknown or unrepresentable in Unicode.
This means you may see something like �.
There is nothing you can do on the client to fix that. The original character was already lost before the string reached you, so the problem is on the server and needs to be fixed there.
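Although the lost character cannot be recovered, the client can at least detect it. Below is a minimal sketch, assuming Java 10 or later (for the `URLDecoder.decode(String, Charset)` overload used in the question). The class and method names (`ReplacementCharCheck`, `decode`, `hasLostCharacters`) are made up for illustration.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class ReplacementCharCheck {
    // U+FFFD REPLACEMENT CHARACTER
    static final char REPLACEMENT = '\uFFFD';

    // Same call as in the question, wrapped for reuse.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    // True if the decoded text contains U+FFFD, i.e. some character
    // was replaced before the string was percent-encoded.
    static boolean hasLostCharacters(String decoded) {
        return decoded.indexOf(REPLACEMENT) >= 0;
    }

    public static void main(String[] args) {
        String good = decode("m%C3%B6torhead album");
        String bad = decode("m%EF%BF%BDtorhead album");

        System.out.println(hasLostCharacters(good)); // false
        System.out.println(hasLostCharacters(bad));  // true
    }
}
```

A check like this only lets you log or reject the damaged input. Which letter was originally there is not stored anywhere in the string.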
Upvotes: 1
Reputation: 587
%C3%B6 is the valid UTF-8 percent-encoding of the character ö, so the value "m%C3%B6torhead album" decodes correctly. In the second case, "%EF%BF%BD" is also valid UTF-8, but it encodes U+FFFD, the replacement character, not ö. The decoder is working as intended; the ö had already been replaced before the string was encoded.
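To check which character each percent-encoded sequence corresponds to, one can go the other way and print the UTF-8 bytes of a character in percent-encoded form. This is only a small sketch; `Utf8Bytes` and `toPercentHex` are hypothetical names, and the helper percent-encodes every byte, unlike a real URL encoder.

```java
import java.nio.charset.StandardCharsets;

class Utf8Bytes {
    // Formats every UTF-8 byte of the input as %XX (for inspection only).
    static String toPercentHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toPercentHex("\u00F6")); // %C3%B6    (ö)
        System.out.println(toPercentHex("\uFFFD")); // %EF%BF%BD (U+FFFD)
    }
}
```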
Upvotes: 0