user1118764

Reputation: 9855

Converting a java byte array to a String

I'm trying to convert a java byte array to a String as follows:

byte[] byteArr = new byte[128];
myFill(byteArr);
String myString = new String(byteArr);

myFill() populates byteArr with a string that's less than 128 characters long, and byteArr is zero padded. The code works fine, except that myString ends up with an illegible character for every zero pad byte. myString.length() also returns 128 instead of the length of the actual ASCII content.

How do I rectify this?

Thanks!

Upvotes: 0

Views: 3305

Answers (1)

user2864740

Reputation: 62003

As jtahlborn pointed out, there is nothing special about NUL (char = 0) in Java strings - it's just another character. Because of this, the (or, at least one) solution is to drop the extra characters when converting the source data into a Java string.

To do that, use the String constructor overload that takes in an array offset/length and a charset:

byte[] byteArr = new byte[128];
myFill(byteArr);
String myString = new String(byteArr, 0, encodedStringLength, "US-ASCII");

Then it's just a matter of finding out the "encodedStringLength" which might look like so (after filling byteArr, of course):

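int encodedStringLength = 0;
while (encodedStringLength < byteArr.length && byteArr[encodedStringLength] != 0) encodedStringLength++;

That's a simple linear scan that stops at the first NUL byte. A plain loop is needed because Arrays.asList(byteArr) on a primitive byte[] produces a single-element List<byte[]>, so calling indexOf(0) on it would always return -1. Keep in mind that if the source string uses all 128 bytes (i.e. is not NUL terminated), the loop simply ends with encodedStringLength equal to byteArr.length.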
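For reference, here is a minimal self-contained sketch of the offset/length approach. The class and method names (NulPadded, lengthBeforeNul, decode) are made up for illustration, and the System.arraycopy call only stands in for the asker's myFill(), whose real behaviour isn't shown. It uses StandardCharsets.US_ASCII rather than the "US-ASCII" string, which avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

final class NulPadded {
    // Index of the first NUL byte, or buf.length if the buffer has no NUL.
    static int lengthBeforeNul(byte[] buf) {
        int i = 0;
        while (i < buf.length && buf[i] != 0) {
            i++;
        }
        return i;
    }

    // Decodes only the bytes before the first NUL as US-ASCII.
    static String decode(byte[] buf) {
        return new String(buf, 0, lengthBeforeNul(buf), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] byteArr = new byte[128]; // zero-filled by default
        byte[] src = "hello".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(src, 0, byteArr, 0, src.length); // hypothetical stand-in for myFill
        String myString = decode(byteArr);
        System.out.println(myString + " " + myString.length()); // hello 5
    }
}
```

A buffer with no NUL at all decodes in full, since lengthBeforeNul falls back to buf.length.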

Also, one should generally (or, perhaps, always) specify a character encoding with String-from-byte[] constructors as the "default encoding" can vary across run-time environments. For instance, if the default encoding was UTF-16 then the original code would also have severely mangled the ASCII source data.
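To make the encoding point concrete, here is a small sketch that decodes the same four bytes twice. The class and method names are invented for the example, and UTF-16 is passed explicitly here just to simulate what a UTF-16 default would do; no platform default is being changed.

```java
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetDemo {
    // Decodes the bytes as 7-bit ASCII, one char per byte.
    static String asAscii(byte[] data) {
        return new String(data, StandardCharsets.US_ASCII);
    }

    // Decodes the same bytes as UTF-16, which pairs them up into 16-bit units.
    static String asUtf16(byte[] data) {
        return new String(data, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        byte[] data = "Java".getBytes(StandardCharsets.US_ASCII);
        System.out.println(asAscii(data).length()); // 4
        System.out.println(asUtf16(data).length()); // 2, and neither char is a Latin letter
    }
}
```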


Alternatively, if one didn't care about leading/trailing spaces or control characters then the following would also work (once again, note the explicit character encoding):

String myString = new String(byteArr, "US-ASCII").trim();

This is because trim removes all leading/trailing characters with values less than or equal to 0x20 (Space) - including NUL characters.
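A quick sketch of how the trim() variant behaves, including where it differs from cutting at the first NUL. TrimDemo and decodeTrimmed are hypothetical names used only for this example.

```java
import java.nio.charset.StandardCharsets;

final class TrimDemo {
    // Decodes the whole buffer, then strips leading/trailing chars <= 0x20.
    static String decodeTrimmed(byte[] buf) {
        return new String(buf, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) {
        byte[] padded = new byte[16];
        byte[] src = " hi".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(src, 0, padded, 0, src.length);
        // The trailing NUL padding and the leading space are both removed.
        System.out.println("[" + decodeTrimmed(padded) + "]"); // [hi]

        byte[] embedded = new byte[16];
        embedded[0] = 'a';
        embedded[2] = 'b'; // embedded[1] stays 0, i.e. a NUL in the middle
        // trim() only touches the ends, so the interior NUL survives.
        System.out.println(decodeTrimmed(embedded).length()); // 3
    }
}
```

So the two approaches agree for plain zero-padded text, but not when the data has leading whitespace or a NUL followed by more bytes.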

Upvotes: 4
