Reputation: 73
I know there are other threads that answer this problem, but my case is a little different.
I have many binary files containing different types of data that need to be displayed (ASCII, hex, ...).
My usual method of displaying ASCII values is to use ISO_8859_1 from the StandardCharsets class. Sadly, StandardCharsets has no constant for ISO-8859-6, which I need in order to display Arabic characters. Here are the methods I use for decoding:
The first method gives me the hex codes as a String:
public static String hexField(byte[] record, int offset, int length) {
    StringBuilder s = new StringBuilder(length * 2);
    int end = offset + length;
    for (int i = offset; i < end; i++) {
        int high_nibble = (record[i] & 0xf0) >>> 4;
        int low_nibble = (record[i] & 0x0f);
        s.append(hex_table[high_nibble]);
        s.append(hex_table[low_nibble]);
    }
    return s.toString();
}
The second method displays the ASCII field, using the previous method:
private static String asciiField(byte[] record, int offset, int length) throws UnsupportedEncodingException {
    String field = hexField(record, offset, length);
    byte[] fieldByte = javax.xml.bind.DatatypeConverter.parseHexBinary(field);
    return new String(fieldByte, StandardCharsets.ISO_8859_1).trim();
}
How can I display Arabic characters encoded in ISO-8859-6? Thank you!
Upvotes: 1
Views: 910
Reputation: 308001
While ISO-8859-6 is not required to be supported by the Java SE standard (and as such doesn't have a corresponding constant in StandardCharsets), I believe it is widely supported.
To use it, simply pass the String constant "ISO-8859-6" wherever a character set name is required. For example, to convert a byte[] containing ISO-8859-6 data to a String, use:
byte[] byteData = {(byte) 0xC2, (byte) 0xD4, (byte) 0xD8};
String s = new String(byteData, "ISO-8859-6");
This works just fine on my machine. (The byteData in that example almost certainly spells gibberish, since I don't know any Arabic, but it does represent some Arabic characters in ISO-8859-6.)
Alternatively, you can use Charset.forName("ISO-8859-6") if you want an actual Charset object at hand. Doing that also means an unsupported charset is reported once, as an unchecked UnsupportedCharsetException at the place where Charset.forName is called, and the checked UnsupportedEncodingException no longer litters every byte[]-to-String conversion.
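Since ISO-8859-6 is not guaranteed to be present, a defensive lookup might look like the sketch below. The class name CharsetLookupDemo and the helpers lookup and decodeOrNull are hypothetical names made up for illustration; only Charset.forName, UnsupportedCharsetException and the String(byte[], Charset) constructor come from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;

public class CharsetLookupDemo {

    // Hypothetical helper: returns the named charset,
    // or null if this JVM does not provide it.
    static Charset lookup(String name) {
        try {
            return Charset.forName(name);
        } catch (UnsupportedCharsetException e) {
            return null;
        }
    }

    // Hypothetical helper: decodes data with the named charset,
    // or returns null if the charset is unavailable.
    static String decodeOrNull(byte[] data, String charsetName) {
        Charset cs = lookup(charsetName);
        return cs == null ? null : new String(data, cs);
    }

    public static void main(String[] args) {
        byte[] byteData = {(byte) 0xC2, (byte) 0xD4, (byte) 0xD8};
        String s = decodeOrNull(byteData, "ISO-8859-6");
        if (s == null) {
            System.out.println("ISO-8859-6 is not available on this JVM");
        } else {
            // Print code points so the result is readable even
            // if the console cannot render Arabic.
            for (char c : s.toCharArray()) {
                System.out.printf("U+%04X%n", (int) c);
            }
        }
    }
}
```

Returning null is just one way to signal a missing charset; failing fast at startup is equally reasonable if the data cannot be shown any other way.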
Also, please note that hexField seems to do the exact opposite of parseHexBinary, so chaining those two methods like that is a pointless byte[] -> hex representation -> byte[] round trip. There is even a String constructor that takes an offset and a length, which you could use directly:
private static final Charset ISO_8859_6 = Charset.forName("ISO-8859-6");

private static String textField(byte[] record, int offset, int length) {
    return new String(record, offset, length, ISO_8859_6).trim();
}
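To see textField working on a slice of a larger record, here is a small self-contained sketch. The class name ArabicFieldDemo, the codePoints helper and the sample record bytes are invented for illustration, and it assumes the JVM provides ISO-8859-6 (otherwise the static initializer fails).

```java
import java.nio.charset.Charset;

public class ArabicFieldDemo {

    private static final Charset ISO_8859_6 = Charset.forName("ISO-8859-6");

    // Decodes length bytes starting at offset, with no hex round trip.
    static String textField(byte[] record, int offset, int length) {
        return new String(record, offset, length, ISO_8859_6).trim();
    }

    // Hypothetical helper: renders a String as a list of code points.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up record: two header bytes, a 5-byte text field
        // (three Arabic letters padded with two spaces), one trailer byte.
        byte[] record = {0x01, 0x02, (byte) 0xC2, (byte) 0xD4, (byte) 0xD8, 0x20, 0x20, 0x7F};
        String field = textField(record, 2, 5);
        System.out.println(codePoints(field)); // prints "U+0622 U+0634 U+0638"
    }
}
```

Note that trim() strips every leading and trailing character at or below U+0020, so space padding and control bytes at the edges of the field both disappear.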
Upvotes: 2