Zied Orabi

Reputation: 73

How to encode hexadecimal code to string using ISO-8859-6 charset?

I know there are other threads that answer this problem but for me, it is a little different.

I have many binary files containing different types of data that need to be displayed (ASCII, hex, ...).

My usual way of displaying ASCII values is to use the ISO_8859_1 constant from the StandardCharsets class. Sadly, StandardCharsets has no constant for ISO-8859-6, which I need in order to display Arabic characters. Here are the methods I use for the conversion:

The first method returns the hex codes as a String:

public static String hexField(byte[] record, int offset, int length) {
     // hex_table is defined elsewhere in the class (not shown here)
     StringBuilder s = new StringBuilder(length * 2);
     int end = offset + length;

     for (int i = offset; i < end; i++) {
         int high_nibble = (record[i] & 0xf0) >>> 4;
         int low_nibble = (record[i] & 0x0f);
         s.append(hex_table[high_nibble]);
         s.append(hex_table[low_nibble]);
     }

     return s.toString();
}

The second method displays the ASCII field using the previous method:

private static String asciiField(byte[] record, int offset, int length) throws UnsupportedEncodingException {
    String field = hexField(record, offset, length);

    byte[] fieldByte = javax.xml.bind.DatatypeConverter.parseHexBinary(field);
    return new String(fieldByte, StandardCharsets.ISO_8859_1).trim();
}

How can I display Arabic characters encoded in ISO-8859-6? Thank you!

Upvotes: 1

Views: 910

Answers (1)

Joachim Sauer

Reputation: 308001

While ISO-8859-6 is not required to be supported by the Java SE standard (and as such doesn't have a corresponding constant in StandardCharsets), I believe it is widely supported.

To use it, pass the String constant "ISO-8859-6" wherever a character set name is accepted. For example, to convert a byte[] containing ISO-8859-6 data to a String, use

byte[] byteData = {(byte) 0xC2, (byte) 0xD4, (byte) 0xD8};
String s = new String(byteData, "ISO-8859-6");

This works on my machine just fine. (The byteData in that example almost certainly spells gibberish, since I don't know any Arabic, but it does represent some Arabic characters in ISO-8859-6.)
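To check what those three bytes decode to without depending on the console's font or encoding, one can print the resulting code points. The following is a minimal sketch: the class name Iso88596Demo and the helper decode are made up for illustration, and it assumes the running JVM ships the ISO-8859-6 charset.

```java
import java.io.UnsupportedEncodingException;

public class Iso88596Demo {
    // Hypothetical helper: decodes bytes as ISO-8859-6 via the charset-name overload.
    static String decode(byte[] data) {
        try {
            return new String(data, "ISO-8859-6");
        } catch (UnsupportedEncodingException e) {
            // Only reached if this JVM does not provide ISO-8859-6.
            throw new IllegalStateException("ISO-8859-6 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] byteData = {(byte) 0xC2, (byte) 0xD4, (byte) 0xD8};
        String s = decode(byteData);
        // Print code points instead of glyphs so the output is console-independent.
        for (int i = 0; i < s.length(); i++) {
            System.out.printf("U+%04X%n", (int) s.charAt(i));
        }
    }
}
```

On a JVM that provides the charset, this should print U+0622, U+0634 and U+0638, which are the Arabic letters alef with madda above, sheen and zah.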

Alternatively, you can use Charset.forName("ISO-8859-6") if you want an actual Charset object at hand. That also moves the failure for an unsupported charset to the single place where Charset.forName is called (where it surfaces as an unchecked UnsupportedCharsetException), so not every byte[]-to-String conversion has to deal with the checked UnsupportedEncodingException.
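If you would rather degrade gracefully than fail when the charset is missing, the lookup can be wrapped once. This is only a sketch: CharsetLookup and lookupOrFallback are hypothetical names, and falling back to ISO-8859-1 is an assumption about what is acceptable for your data.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

public class CharsetLookup {
    // Hypothetical helper: returns the named charset, or the fallback
    // if this JVM does not provide a charset with that (legal) name.
    static Charset lookupOrFallback(String name, Charset fallback) {
        try {
            return Charset.forName(name);
        } catch (UnsupportedCharsetException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        Charset arabic = lookupOrFallback("ISO-8859-6", StandardCharsets.ISO_8859_1);
        // Shows which charset was actually selected on this JVM.
        System.out.println(arabic.name());
    }
}
```

Charset.isSupported("ISO-8859-6") offers the same check without exception handling, if you only need a yes/no answer.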

Also note that hexField does the exact opposite of parseHexBinary, so chaining the two as asciiField does is a pointless byte[] -> hex representation -> byte[] round trip. There is even a String constructor that takes an offset and a length, which you could use directly:

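import java.nio.charset.Charset;

// Looked up once; throws an unchecked UnsupportedCharsetException if the JVM lacks the charset.
private static final Charset ISO_8859_6 = Charset.forName("ISO-8859-6");

// Decodes length bytes starting at offset as ISO-8859-6 and trims surrounding whitespace.
private static String textField(byte[] record, int offset, int length) {
    return new String(record, offset, length, ISO_8859_6).trim();
}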
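Put together, the two kinds of field can be read straight from the record without any round trip through hex. The sketch below is self-contained for illustration only: the class name FieldDecoder, the HEX table (the question does not show how hex_table is defined) and the sample record layout are all assumptions.

```java
import java.nio.charset.Charset;

public class FieldDecoder {
    private static final Charset ISO_8859_6 = Charset.forName("ISO-8859-6");
    // Assumed lookup table; stands in for the question's hex_table.
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Renders each byte of the field as two hex digits.
    static String hexField(byte[] record, int offset, int length) {
        StringBuilder s = new StringBuilder(length * 2);
        for (int i = offset; i < offset + length; i++) {
            s.append(HEX[(record[i] & 0xF0) >>> 4]).append(HEX[record[i] & 0x0F]);
        }
        return s.toString();
    }

    // Decodes the field directly from the record; no hex round trip needed.
    static String textField(byte[] record, int offset, int length) {
        return new String(record, offset, length, ISO_8859_6).trim();
    }

    public static void main(String[] args) {
        // Made-up record: two binary bytes, three Arabic letters, two padding spaces.
        byte[] record = {0x01, 0x2A, (byte) 0xC2, (byte) 0xD4, (byte) 0xD8, 0x20, 0x20};
        System.out.println(hexField(record, 0, 2));           // prints 012A
        System.out.println(textField(record, 2, 5).length()); // prints 3
    }
}
```

The trailing spaces are removed by trim(), so the text field comes back as exactly three characters.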

Upvotes: 2
