shantanu

Reputation: 2418

Cannot convert String to byte array and vice versa in Java

I am trying to convert a byte array to a String, but the conversion alters the values, so I cannot restore the original byte array from the converted String.

byte[] array = {-64,-88,1,-2};
ByteArrayOutputStream out = new ByteArrayOutputStream();
out.write(array);
String result = out.toString("UTF-8");
byte[] array2 = result.getBytes("UTF-8");
// output of array2: {-17,-65,-67,-17}

Upvotes: 3

Views: 2360

Answers (3)

morpheus05

Reputation: 4872

You have to use a fixed single-byte encoding, like the one Jan suggested. UTF-8 is a variable-width encoding, which means that in certain cases you need more than one byte to encode a single code point. This is one of those cases: negative byte values have the high bit set, and in UTF-8 such bytes are only valid as part of a multi-byte sequence. (See the table in the Wikipedia article on UTF-8.)

What I found interesting was that, after converting the second array to a string, the two strings were identical but the underlying arrays were not. The point is that the given bytes are not a valid UTF-8 representation of any code point, so each invalid byte gets replaced with code point 65533 (U+FFFD, the replacement character), which in turn needs 3 bytes to be represented. That explains the output:

[-17, -65, -67, -17, -65, -67, 1, -17, -65, -67]

The first two invalid bytes each become -17, -65, -67, which is the UTF-8 encoding of the replacement character. The 1 is a legitimate code point, so it "survived" the transformation, and the last byte is again invalid.
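The replacement behaviour described above can be reproduced with a minimal sketch like the following. The class name `Utf8ReplacementDemo` and the helper `roundTripUtf8` are made up for illustration, and it uses the `String(byte[], Charset)` constructor rather than the `ByteArrayOutputStream` from the question, which should decode the same way.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8ReplacementDemo {

    // Hypothetical helper: decodes raw bytes as UTF-8, then encodes the
    // resulting String back to UTF-8. Invalid input bytes are replaced
    // with U+FFFD during decoding, so the result can differ from the input.
    static byte[] roundTripUtf8(byte[] input) {
        String decoded = new String(input, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = {-64, -88, 1, -2}; // 0xC0 0xA8 0x01 0xFE

        // Print each decoded char as a code point to see the replacements.
        String decoded = new String(original, StandardCharsets.UTF_8);
        for (int i = 0; i < decoded.length(); i++) {
            System.out.printf("U+%04X%n", (int) decoded.charAt(i));
        }

        // Print the re-encoded bytes for comparison with the original array.
        System.out.println(Arrays.toString(roundTripUtf8(original)));
    }
}
```

Running `main` should show three U+FFFD characters around a single U+0001, and a re-encoded array matching the one quoted in this answer.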

Upvotes: 2

PixelKicker

Reputation: 5

I believe you can create a String from a byte array by passing the array to the constructor, like this:

String test = new String(byte_array);

String also has a getBytes() method that converts the String back to a byte array.

I hope that helped at least a bit

Upvotes: -1

Jan

Reputation: 13858

It's a charset issue: UTF-8 is a variable-width encoding that can use more than one byte per character, and not every byte sequence is valid UTF-8. Try the same with a single-byte charset such as

String result = out.toString("ISO-8859-15");
byte[] array2 = result.getBytes("ISO-8859-15");
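As a rough sketch of the single-byte approach, the example below round-trips the array from the question. It swaps in ISO-8859-1 (via `StandardCharsets.ISO_8859_1`) instead of ISO-8859-15, on the assumption that any charset mapping every byte value to exactly one character works here; ISO-8859-1 is chosen only because every Java platform is required to support it. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SingleByteRoundTrip {

    // Hypothetical helper: maps each byte to exactly one char (U+0000..U+00FF).
    static String bytesToString(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: inverse of bytesToString. Only meant for Strings
    // produced by bytesToString; chars above U+00FF cannot be encoded in
    // ISO-8859-1 and would be replaced.
    static byte[] stringToBytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] array = {-64, -88, 1, -2};
        String result = bytesToString(array);
        byte[] array2 = stringToBytes(result);

        // Compare the restored bytes with the original array.
        System.out.println(Arrays.toString(array2));
        System.out.println(Arrays.equals(array, array2));
    }
}
```

Note that the intermediate String is only a container for the bytes in this setup; it is not meaningful text unless the bytes really were ISO-8859-1 encoded characters.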

Upvotes: 4
