rajaramyadhav

Reputation: 347

Converting a byte array to string without using new operator in Java

Is there a way to convert a byte array to a string other than using new String(bytearray)? The exact problem is that I transmit a JSON-formatted string over the network through a UDP connection. At the other end, I receive it in a fixed-size byte array (since I do not know the message size in advance) and create a new string from the byte array. If I do this, all of the memory I allocated is held unnecessarily.

To avoid this, I convert the byte array to a string, truncate the string at the last valid character, convert it back to a byte array, and create a new string from that. This uses only the required memory, but garbage collection becomes much more frequent because it involves more allocations. What is the best way to do this?

Upvotes: 5

Views: 7015

Answers (4)

spring.ace

Reputation: 181

Can you write the input stream to a ByteArrayOutputStream first, then call toString on the output stream? Something like this:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
while (!socket.isClosed()) {
    InputStream is = socket.getInputStream();
    byte[] buffer = new byte[1024]; // some tmp buffer.  Define the appropriate size here
    int bytesRead;
    while ((bytesRead = is.read(buffer)) != -1) {
        baos.write(buffer, 0, bytesRead);
        if (is.available() <= 0) {
            break;
        }
    }
    System.out.println(baos.toString());
    baos.reset();
}

Upvotes: 0

Stephen C

Reputation: 718788

The simplest and most reliable way to do this is to use the length of the packet that you read from the UDP socket. The javadoc for DatagramSocket.receive(...) says this:

Receives a datagram packet from this socket. When this method returns, the DatagramPacket's buffer is filled with the data received. The datagram packet also contains the sender's IP address, and the port number on the sender's machine.

This method blocks until a datagram is received. The length field of the datagram packet object contains the length of the received message. If the message is longer than the packet's length, the message is truncated.
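
As a minimal sketch of that approach (the class name PacketDecoder, the method names, and the choice of UTF-8 are assumptions for illustration, not part of the original answer), the String can be built from only the bytes the packet reports as received:

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes only the received portion of a datagram.
final class PacketDecoder {

    // Builds a String from the valid region of the packet's buffer.
    // UTF-8 is an assumption; use whatever charset the sender encodes with.
    static String decode(DatagramPacket packet) {
        return new String(packet.getData(),
                          packet.getOffset(),
                          packet.getLength(),
                          StandardCharsets.UTF_8);
    }

    // Receives one datagram into a reusable buffer and decodes it.
    static String receiveOne(DatagramSocket socket,
                             DatagramPacket packet,
                             byte[] buf) throws IOException {
        // setData resets the offset to 0 and the length to buf.length,
        // so the full buffer is available for this receive.
        packet.setData(buf);
        socket.receive(packet);
        return decode(packet);
    }
}
```

The same buffer and DatagramPacket can be reused for every call, so the only allocation per message is the resulting String.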

If you cannot do that, then the following will allocate a minimum sized String with no unnecessary allocation of temporaries.

  byte[] buff = ... // read from socket.

  // Find byte offset of first 'non-character' in buff
  int i;
  for (i = 0; i < buff.length && /* buff[i] represents a character */; i++) { /**/ }

  // Allocate String
  String res = new String(buff, 0, i, charsetName);

Note that the criterion for determining a non-character is character set and application specific. But probably testing for a zero byte is sufficient.
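
A runnable sketch of the zero-byte variant follows. It assumes the buffer is zero-filled before each receive and that the payload never contains a zero byte; the class name ZeroTerminated and the use of UTF-8 are illustrative choices only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: treats the first zero byte as the end of the data.
final class ZeroTerminated {

    // Returns the index of the first zero byte, or buff.length if none.
    static int validLength(byte[] buff) {
        int i = 0;
        while (i < buff.length && buff[i] != 0) {
            i++;
        }
        return i;
    }

    // Allocates a String covering only the bytes before the first zero.
    static String decode(byte[] buff) {
        return new String(buff, 0, validLength(buff), StandardCharsets.UTF_8);
    }
}
```

If the buffer is reused between receives without being cleared, leftover bytes from a longer earlier message would be picked up, which is one reason the packet length is the more reliable source.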

EDIT

What exactly does the javadoc mean by "The length of the new String is a function of the charset, and hence may not be equal to the length of the subarray."?

It refers to the fact that in some character encodings (for example UTF-8, UTF-16, JIS, etc.) some characters are represented by two or more bytes. So, for example, 10 bytes of UTF-8 might represent fewer than 10 characters.
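
A small illustration of the difference between byte count and character count in UTF-8 (the class name LengthDemo and its methods are made up for this example):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: compares encoded byte length with String length.
final class LengthDemo {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int byteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of chars in the String decoded from UTF-8 bytes.
    static int charCount(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8).length();
    }
}
```

For instance, the five-character string "h\u00e9llo" encodes to six UTF-8 bytes because the accented letter takes two bytes, and decoding those six bytes gives back a String of length five.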

Upvotes: 2

Michael Burr

Reputation: 340198

Would something like:

String s = new String( bytearray, 0, lenOfValidData, "US-ASCII");

do what you want (change the charset to whatever encoding is appropriate)?


Update:

Based on your comments, you might want to try:

socket.receive(packet);
String strPacket = new String( packet.getData(), 0, packet.getLength(), "US-ASCII");
receiver.onReceive( strPacket);

I'm not familiar enough with Java's datagram support to know if packet.getLength() returns the truncated length or the original length of the datagram (before truncation to fit in the receive buffer). It might be safer to create the string like so:

String strPacket = new String( packet.getData(), 
                               0, 
                               Math.min( packet.getLength(), packet.getData().length),
                               "US-ASCII");

Then again, it might be unnecessary.

Upvotes: 2

Dave O.

Reputation: 2281

You could avoid the second String creation by using a StringBuilder. I imagine your data-receiving process looking like this:

  1. Get the (fixed-size) byte array at the client side.
  2. Create a StringBuilder object.
  3. Loop over the array as long as you read valid characters and append them to the StringBuilder object.
  4. The byte array can be thrown away now. (I would rather keep it, though, for the next time you receive something over the network, in order to avoid unnecessary memory allocations.)
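
The steps above could be sketched as follows. This assumes single-byte US-ASCII data and treats a zero or non-ASCII byte as the end of the valid characters; the class name BuilderDecode and method decodeAscii are hypothetical.

```java
// Hypothetical helper: copies valid ASCII bytes into a reusable StringBuilder.
final class BuilderDecode {

    // Appends bytes as chars until the first zero or non-ASCII byte,
    // then returns the accumulated text. The builder is cleared first
    // so the same instance can be reused for every message.
    static String decodeAscii(byte[] buf, StringBuilder sb) {
        sb.setLength(0);
        for (byte b : buf) {
            if (b <= 0) {
                break; // zero terminator or byte outside 7-bit ASCII
            }
            sb.append((char) b);
        }
        return sb.toString();
    }
}
```

Note that this only works byte-by-byte for single-byte encodings; for UTF-8 or other multi-byte charsets the bytes would need to go through a charset decoder instead of a plain cast.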

Edit

I followed the suggestion of Tofubeer to use a StringBuilder instead of a StringBuffer.

Upvotes: 0
