carlspring

Reputation: 32617

ByteArrayOutputStream.toString() generating extra characters

I have the following code:

ByteArrayOutputStream baos = new ByteArrayOutputStream();

int size = 4096;
byte[] bytes = new byte[size];

while (is.read(bytes, 0, size) != -1)
{
    baos.write(bytes);
    baos.flush();
}

When I do:

String s = baos.toString();

I get \u0000 characters appended to my string. So, if my character data is only X bytes out of a buffer of Y bytes, the remaining Y-X bytes come out as \u0000, which makes it impossible to compare the result with equals(). What am I doing wrong here? How should I be converting the bytes to a String in this case?

Upvotes: 0

Views: 1657

Answers (3)

Ted Hopp

Reputation: 234807

You should only be writing as much data as you are reading in each time through the loop:

ByteArrayOutputStream baos = new ByteArrayOutputStream();

int size;
byte[] bytes = new byte[4096];

while ((size = is.read(bytes, 0, bytes.length)) != -1)
{
    baos.write(bytes, 0, size);
}
baos.flush();
String s = baos.toString();

You might also consider specifying an explicit character set when converting the bytes to a String. The no-arg toString() method uses the platform default encoding, so the result can differ from one machine to another.
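As a rough sketch of both points together (write only what was read, then decode with an explicit charset), something like the following should work. The class name StreamText and the helper readAll are made up for illustration, and UTF-8 is only an assumption about how the data is encoded; use whatever charset the bytes were actually written in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamText {
    // Hypothetical helper: drains the stream and decodes the bytes
    // with the charset the caller names, not the platform default.
    static String readAll(InputStream is, Charset charset) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int size;
        // read() returns the number of bytes placed in the buffer, or -1 at end of stream
        while ((size = is.read(buffer, 0, buffer.length)) != -1) {
            // write only the bytes that were actually read this pass
            baos.write(buffer, 0, size);
        }
        return new String(baos.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Assumption for the demo: the input is UTF-8 encoded text
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        String s = readAll(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        System.out.println(s.equals("h\u00e9llo")); // prints "true"
    }
}
```

Decoding through new String(byte[], Charset) avoids the checked UnsupportedEncodingException that the toString(String charsetName) overload declares.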

Upvotes: 3

user2864740

Reputation: 61885

The entire array (all 4096 bytes) is written to the output on every pass - arrays have no idea of how much "useful data" they contain!

Store how much was read in a variable (InputStream.read returns the number of bytes it placed in the array) and pass that count to the OutputStream.write(byte[], int, int) overload, so that only the portion of the array containing useful data is written.
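To see the difference in numbers, here is a small demo, assuming a 5-byte input and an 8-byte buffer so the padding is easy to count. PaddingDemo and its two methods are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class PaddingDemo {
    // Mirrors the loop in the question: the whole buffer is written every pass.
    static int copyWholeBuffer(InputStream is, int bufferSize) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        while (is.read(buffer, 0, bufferSize) != -1) {
            baos.write(buffer); // writes buffer.length bytes, including unused zeros
        }
        return baos.size();
    }

    // Keeps the count returned by read() and writes only that many bytes.
    static int copyBytesRead(InputStream is, int bufferSize) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        int size;
        while ((size = is.read(buffer, 0, bufferSize)) != -1) {
            baos.write(buffer, 0, size);
        }
        return baos.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes(StandardCharsets.US_ASCII); // 5 bytes
        System.out.println(copyWholeBuffer(new ByteArrayInputStream(data), 8)); // prints 8
        System.out.println(copyBytesRead(new ByteArrayInputStream(data), 8));   // prints 5
    }
}
```

The three extra bytes in the first result are the zero-initialized tail of the buffer, which is what shows up as \u0000 once the bytes are turned into a String.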

While the above change should "fix" the problem, it is generally recommended to use the string<->byte[] conversion forms that take in an explicit character set.

Upvotes: 4
