Reputation: 971
I receive a String in ISO-8859-1 encoding, but some characters are not decoded correctly.
Here is the code I'm using:
InputStream plainIs = plainText.getIs();
StringBuilder stringBuilder = new StringBuilder();
String line = null;
try (BufferedReader bufferedReader = new BufferedReader(
        new InputStreamReader(plainIs, "iso-8859-1"))) {
    while ((line = bufferedReader.readLine()) != null) {
        stringBuilder.append(line);
    }
}
body = stringBuilder.toString();
log.debug("Plain Text Body: " + body);
As input, I have a sentence like this:
L=92objet est donc de proposer un outil simple =E9volutif
but the decoded result is:
L�objet est donc de proposer un outil simple évolutif
The sequence =E9 is correctly decoded as é, but the =92 in L=92objet comes out like this: L�objet
Any idea why only part of the text is converted correctly?
Upvotes: 1
Views: 170
Reputation: 4667
It seems 92 is not defined as a printable character in ISO-8859-1 (nothing in the 80 to 9F range is), as you can see in the chart on this page. The chart shows é as E9, which is why that character is output correctly. If you are trying to get ' as a character, try using =27 instead of =92.
There is also Windows-1252, a superset of ISO-8859-1 found here, which does define 92 (as the right single quotation mark ’) since its second version:
The second version, used in Microsoft Windows 2.0, positions D7, F7, 91, and 92 had been defined.
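To illustrate the difference, here is a minimal sketch that decodes the same bytes with both charsets. The class name CharsetComparison and the helper decode are hypothetical, and the byte values are an assumption about what the stream contains once =92 and =E9 have been turned into raw bytes. The "windows-1252" charset is not one of the charsets every Java platform is required to support, although common JDKs ship it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class CharsetComparison {

    // Decodes the raw bytes with the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Assumed raw bytes for "L", 0x92, "objet ", 0xE9.
        byte[] raw = {'L', (byte) 0x92, 'o', 'b', 'j', 'e', 't', ' ', (byte) 0xE9};

        // Not guaranteed on every JVM; forName throws
        // UnsupportedCharsetException if it is missing.
        Charset cp1252 = Charset.forName("windows-1252");

        String latin1 = decode(raw, StandardCharsets.ISO_8859_1);
        String win = decode(raw, cp1252);

        // Print code points rather than glyphs so the console encoding
        // does not affect the result.
        System.out.printf("ISO-8859-1:   U+%04X%n", (int) latin1.charAt(1)); // U+0092, a control character
        System.out.printf("windows-1252: U+%04X%n", (int) win.charAt(1));    // U+2019, right single quotation mark
    }
}
```

If the sender really used Windows-1252, passing that charset to the InputStreamReader in the question instead of "iso-8859-1" should decode both characters, since E9 maps to é in either charset.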
Upvotes: 1