Reputation: 137
People are sending us non-printable characters such as \x86 in byte arrays in Java, and when we convert the bytes to a US-ASCII string, junk characters end up in the ASCII text.
Is there a string format, or some other way, to handle non-printable ASCII characters when converting data from formats like EBCDIC to ASCII in Java?
Upvotes: 0
Views: 2315
Reputation: 109557
If you are in the US or "Western Europe" (UK, France, Germany), the character set is probably Windows-1252. The single-byte charset US-ASCII covers 128 characters; the single-byte charset Windows-1252 is a superset that assigns characters to almost all of the 256 values in the byte range.
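To see the difference, here is a minimal sketch that decodes the same bytes with both charsets. The class and method names are made up for illustration, and it assumes your JDK ships the "windows-1252" charset (common JDKs do, but only US-ASCII, ISO-8859-1 and the UTF charsets are guaranteed).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class DecodeComparison {

    // new String(bytes, charset) silently substitutes U+FFFD for bytes
    // the charset cannot decode; US-ASCII cannot decode 0x80..0xFF.
    static String asAscii(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Assumption: the data really is Windows-1252.
    static String asCp1252(byte[] bytes) {
        return new String(bytes, Charset.forName("windows-1252"));
    }

    // Print each char as a code point so the result is visible
    // regardless of the console encoding.
    static void dump(String s) {
        for (char c : s.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }

    public static void main(String[] args) {
        byte[] data = {'A', (byte) 0x86, 'B'};
        dump(asAscii(data));
        dump(asCp1252(data));
    }
}
```

With US-ASCII the byte 0x86 becomes the replacement character U+FFFD, which is most likely the "junk" you are seeing; with Windows-1252 it becomes U+2020 (the dagger sign).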
Easiest is a translation table for the range \u0080 - \u00ff. Use String entries rather than char, as some characters might be better replaced by several chars, say \u008c (byte 0x8C, Œ in Windows-1252) by "OE".
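A rough sketch of such a table, assuming the input is Windows-1252. The class name, the handful of entries and the "?" fallback are only examples; fill in whatever replacements make sense for your data.

```java
// Hypothetical helper class, for illustration only.
class HighByteTranslator {

    // One slot per byte value 0x80..0xFF (index = value - 0x80).
    // A null slot means "no entry", and the fallback is used instead.
    private static final String[] TABLE = new String[128];

    static {
        // Example entries only, based on the Windows-1252 meaning of each byte.
        TABLE[0x8C - 0x80] = "OE";   // Œ
        TABLE[0x9C - 0x80] = "oe";   // œ
        TABLE[0x85 - 0x80] = "...";  // horizontal ellipsis
        TABLE[0x86 - 0x80] = "+";    // dagger
        TABLE[0xE9 - 0x80] = "e";    // é
    }

    // Assumed fallback for bytes without a table entry.
    private static final String FALLBACK = "?";

    static String toAscii(byte[] input) {
        StringBuilder sb = new StringBuilder(input.length);
        for (byte b : input) {
            int v = b & 0xFF;            // bytes are signed in Java
            if (v < 0x80) {
                sb.append((char) v);     // plain ASCII passes through unchanged
            } else {
                String r = TABLE[v - 0x80];
                sb.append(r != null ? r : FALLBACK);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {'c', (byte) 0x9C, 'u', 'r', ' ', (byte) 0x86, ' ', (byte) 0xFF};
        System.out.println(toAscii(data)); // prints "coeur + ?"
    }
}
```

Note that this only touches bytes 0x80 and above. ASCII control characters below 0x20 pass through as they are, so add a check for those if they are also a problem.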
Upvotes: 0
Reputation: 832
How would you like to handle them? Replace them with something printable (such as '?')? Remove them entirely? Some other action?
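For what it's worth, the first two options can be sketched with a `CharsetDecoder` and `CodingErrorAction`. The class and method names here are made up; the decoder calls are standard `java.nio.charset` API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class NonAsciiPolicy {

    // Replace every byte that is not valid US-ASCII with '?'.
    static String replaceWithQuestionMark(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith("?");
        return decode(decoder, bytes);
    }

    // Drop every byte that is not valid US-ASCII.
    static String remove(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(CodingErrorAction.IGNORE)
                .onUnmappableCharacter(CodingErrorAction.IGNORE);
        return decode(decoder, bytes);
    }

    private static String decode(CharsetDecoder decoder, byte[] bytes) {
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE or IGNORE; only REPORT makes decode throw.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {'A', (byte) 0x86, 'B'};
        System.out.println(replaceWithQuestionMark(data)); // prints "A?B"
        System.out.println(remove(data));                  // prints "AB"
    }
}
```

A third option is `CodingErrorAction.REPORT`, which makes `decode` throw a `CharacterCodingException` so that bad input can be rejected outright.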
Upvotes: 1