Norah A

Reputation: 11

How to convert a String to UTF-16 binary and binary back to a String in Java?

My project requires converting Arabic text to binary, then converting the binary back to text (the reverse process). I used the code below, but I noticed that when I use UTF-16 to convert a string to binary, reading that binary back into the original UTF-16 string gives me different characters.

For example: the Arabic character used in the encoding is (ن), which is converted to binary (0100011000000110) by UTF-16LE.

Now when I convert these binary bits (0100011000000110) back to the original UTF-16 string, I get a different character: F.

This problem only appears when the string contains Arabic characters and UTF-16 encoding is used. How can I solve this problem?
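For reference, the byte layout described above can be verified directly; a minimal sketch (the class name `ByteLayoutDemo` is just for illustration):

```java
import java.nio.charset.StandardCharsets;

public class ByteLayoutDemo {
    public static void main(String[] args) {
        // U+0646 (ن) encoded in UTF-16LE: low byte first -> 0x46, then 0x06,
        // which is the bit string 01000110 00000110 from the question
        byte[] bytes = "\u0646".getBytes(StandardCharsets.UTF_16LE);
        for (byte b : bytes) {
            System.out.printf("%02X ", b); // prints: 46 06
        }
        System.out.println();
    }
}
```

Note that 0x46 on its own is the ASCII code for 'F', which is why a wrong decoding step can surface that letter.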

// Convert the text to binary
public static String getBinaryFromText(String secretText) {
    byte[] bytes = secretText.getBytes(StandardCharsets.UTF_16LE);
    StringBuilder binary = new StringBuilder();
    for (byte b : bytes) {
        int val = b;
        for (int i = 0; i < 8; i++) {
            binary.append((val & 128) == 0 ? 0 : 1);
            val <<= 1;
        }
    }

    return binary.toString();
}

// Convert the binary to text.
public static String getTextFromBinary(String binaryString) {
    String binary = binaryString.replace(" ", "");
    String binaryPer8Bits;
    byte[] byteData;
    byteData = new byte[binary.length() / 8];

    for (int i = 0; i < binary.length() / 8; i++) {
        // To divide the string into 8 characters
        binaryPer8Bits = binary.substring(i * 8, (i + 1) * 8);
        // The integer of decimal string of binary numbers
        Integer integer = Integer.parseInt(binaryPer8Bits, 2);
        // The casting to a byte type variable
        byteData[i] = integer.byteValue();
    }

    return new String(byteData);
}

Upvotes: 0

Views: 1131

Answers (1)

k5_

Reputation: 5568

With new String(byteData); you interpret the created byte[] using the platform's default encoding. To interpret it as UTF_16LE you need to use a different constructor:

new String(byteData, StandardCharsets.UTF_16LE);

(Almost) never use new String(byte[]): it uses the platform's default encoding, so your application will be platform dependent.

Upvotes: 5
