Reputation: 116159
I need to convert numerical values into byte arrays. For example, to convert a long to a byte array, I have this method:
public static byte[] longToBytes(long l) {
ByteBuffer buff = ByteBuffer.allocate(8);
buff.order(ByteOrder.BIG_ENDIAN);
buff.putLong(l);
return buff.array();
}
It's pretty straightforward: take a long, allocate an array that can hold it, and throw it in there. Regardless of what the value of l is, I will get an 8-byte array back that I can then process and use as intended. In my case, I'm creating a custom binary format and then transmitting it over a network.
When I invoke this method with a value of 773450364, I get an array [0 0 0 0 46 25 -22 124] back. I have code that also converts byte arrays back into their numerical values:
public static Long bytesToLong(byte[] aBytes, int start) {
byte[] b = new byte[8];
b[0] = aBytes[start + 0];
b[1] = aBytes[start + 1];
b[2] = aBytes[start + 2];
b[3] = aBytes[start + 3];
b[4] = aBytes[start + 4];
b[5] = aBytes[start + 5];
b[6] = aBytes[start + 6];
b[7] = aBytes[start + 7];
ByteBuffer buf = ByteBuffer.wrap(b);
return buf.getLong();
}
When I pass the array from the other method back into this method, I get 773450364, which is correct.
Now, I transmit this array over TCP to another Java client. The documentation for the java.io.InputStream.read() method says that it returns an int value between 0 and 255, unless the end of the stream is reached, in which case it returns -1. However, when I use it to populate a byte array, I continue to get negative values on the receiving side. I suspect this has to do with overflow (a value of 255 cannot fit into a Java byte, so when I put it into the byte array, it overflows and becomes negative).
This brings me to my problem. The existence of the negative numbers concerns me. Right now, I'm developing the Java side of an application, where a byte is between -128 and 127 inclusive. The other endpoint might be in C, C++, Python, Java, C#...who knows. I'm not sure how the existence of negative values in some byte arrays is going to affect processing. Other than documenting this behavior, what can/should I do to make it easier for myself and future developers working on this system, especially in endpoints that are not written in Java?
Upvotes: 4
Views: 3141
Reputation: 310860
If all your data is big-endian you can save yourself all this trouble and use a DataOutputStream. It has all you need.
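As a rough sketch of what that could look like: DataOutputStream.writeLong always writes 8 bytes, high byte first, and DataInputStream.readLong reads them back. The class name DataStreamSketch and the in-memory streams are assumptions for illustration only; in the real program the streams would wrap the socket's streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
final class DataStreamSketch {

    // Encodes one long as 8 bytes in big-endian order (what writeLong always does).
    static byte[] encode(long value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeLong(value);
        }
        return sink.toByteArray();
    }

    // Decodes the first 8 bytes back into a long.
    // readLong() throws EOFException if the stream ends before 8 bytes are read.
    static long decode(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readLong();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encode(773450364L);
        System.out.println(Arrays.toString(wire)); // [0, 0, 0, 0, 46, 25, -22, 124]
        System.out.println(decode(wire));          // 773450364
    }
}
```

The bytes match what the ByteBuffer version in the question produces, so the two approaches should be interchangeable on the wire.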
Upvotes: 0
Reputation: 11999
Both your sending and receiving endpoints are currently implemented in Java. Conceivably, you're using an OutputStream on the sending side and an InputStream on the receiving side. Assuming we can trust the underlying socket implementation details for a moment, we'll consider any byte sent over the socket to arrive at its destination exactly the same.
So what actually happens at the Java level when dumping something into the OutputStream? When checking the JavaDoc for the method writing a byte array, write(byte[]), we see that all it tells us is that bytes are being sent over the stream. Nothing major there. But when you check the doc for the method taking an int as argument, write(int), you'll see it details how this int is actually written out: the lower-order 8 bits are sent over the stream as a byte, while the higher-order 24 bits (int having a 32-bit representation in Java) are simply ignored.
Over to the receiving side. You've got an InputStream. Unless you use one of the methods reading directly into a byte array, you'll be given an int. Like the doc says, the int will either be a value between 0 and 255 inclusive, or -1 if the end of the stream has been reached. This is the important bit. On the one hand, we want every possible bit pattern of a single byte to be readable from an InputStream. But we must also have some way of detecting when a read can no longer return meaningful values. That's why that method returns an int instead of a byte: the -1 value is the flag saying the end of the stream was reached.
If you get anything other than -1, the only thing of interest is those lower 8 bits. Since these can be any bit pattern, their value once stored in a Java byte will range from -128 to 127 inclusive. When you read directly into a byte array instead of int by int, that "trimming" is gonna be done for you. So it makes sense that you're gonna see those negative values. That said, they're only negative because Java treats a byte as a signed two's-complement value. The only thing of interest is the actual bit pattern. For all you care it could represent values 0 to 255 or 1000 to 1255.
A typical InputStream read loop that uses one byte at a time is gonna look like this:
InputStream ips = ...;
int read = 0;
while((read = ips.read()) != -1) {
byte b = (byte)read;
//b now holds one of the 256 bit patterns 0x00 to 0xff, which Java reads as -128 to 127 in two's-complement signed representation
}
When run, the following (uses Java 7 underscore-separated hex literals) will be illuminating:
public class Main {
public static void main(String[] args) {
final int i1 = 0x00_00_00_fe; // positive int, low byte 0xfe
final int i2 = 0x80_00_00_fe; // negative int (sign bit set), same low byte
final byte b1 = (byte)i1;
final byte b2 = (byte)i2;
System.out.println(i1); // 254
System.out.println(i2); // -2147483394
System.out.println(b1); // -2
System.out.println(b2); // -2
final int what = 0x12_34_56_fe;
final byte the_f = (byte)what;
System.out.println(what); // 305420030
System.out.println(the_f); // -2
}
}
As will be clear from this, casting from int to byte will simply ditch anything but the least significant 8 bits. So the int could be a positive or negative number, it won't have any bearing on the byte value. Only the last 8 bits.
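Going the other way, here is a minimal sketch of how to view a received byte as the 0 to 255 value a C or Python peer would see. The class and method names (UnsignedByteSketch, toUnsigned) are made up for illustration; the mask with 0xFF is the standard idiom, and Byte.toUnsignedInt does the same thing on Java 8 and later.

```java
// Hypothetical class name, used only for this illustration.
final class UnsignedByteSketch {

    // Widens a byte to an int without sign extension, giving 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xEA;                          // bit pattern 1110 1010
        System.out.println(b);                         // -22
        System.out.println(toUnsigned(b));             // 234
        System.out.println(Byte.toUnsignedInt(b));     // 234 (Java 8+)
        System.out.println((byte) toUnsigned(b) == b); // true: same bit pattern
    }
}
```

The bit pattern on the wire never changes; only the way each side chooses to print it does.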
Long story short: you're getting correct byte values from your InputStream. The real worry here is that if the client side could be written in any programming language and run on any platform, you'll need to make it abundantly clear in your documentation what the received bytes mean and, if they're a long, how this is encoded. Make it clear that the encoding is done in Java, using ByteBuffer's putLong method in a specific endianness. Only then will they have the info (combined with Java specs) to be absolutely certain how to interpret those bytes.
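One way to make such documentation concrete is to spell the format out with plain shifts and masks, which translate almost line for line into other languages. The sketch below assumes the format is "8 bytes, most significant byte first, two's complement", matching the big-endian putLong in the question; the class and method names are hypothetical.

```java
// Hypothetical class name, used only for this illustration.
final class WireFormatSketch {

    // Encodes a 64-bit value as 8 bytes, most significant byte first.
    static byte[] encodeBigEndianLong(long value) {
        byte[] out = new byte[8];
        for (int i = 7; i >= 0; i--) {
            out[i] = (byte) value; // keep the lowest 8 bits
            value >>>= 8;          // move the next byte into place
        }
        return out;
    }

    // Decodes 8 bytes starting at 'start', most significant byte first.
    static long decodeBigEndianLong(byte[] bytes, int start) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            // Mask with 0xFFL so a "negative" byte does not sign-extend.
            result = (result << 8) | (bytes[start + i] & 0xFFL);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] wire = {0, 0, 0, 0, 46, 25, -22, 124};
        System.out.println(decodeBigEndianLong(wire, 0)); // 773450364
    }
}
```

A non-Java endpoint only has to reproduce these two loops; it does not need to know anything about ByteBuffer.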
Upvotes: 1
Reputation: 206786
A byte in Java is represented in 8-bit two's complement format. If you have an int that is in the range 128 to 255 and you cast it to a byte, then it will become a byte with a negative value (between -128 and -1).
After calling read(), you must check whether the returned value is -1 before you cast it to byte. The reason why the method returns an int rather than a byte is to allow you to check for end-of-stream before you convert it to a byte.
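A minimal sketch of that check, assuming you want to collect a fixed number of bytes one read() at a time (the class and method names here are made up for illustration):

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class name, used only for this illustration.
final class ReadExactlySketch {

    // Reads exactly 'count' bytes, or throws EOFException if the stream ends first.
    static byte[] readExactly(InputStream in, int count) throws IOException {
        byte[] buffer = new byte[count];
        for (int i = 0; i < count; i++) {
            int value = in.read();    // 0..255, or -1 at end of stream
            if (value == -1) {
                throw new EOFException("stream ended after " + i + " of " + count + " bytes");
            }
            buffer[i] = (byte) value; // safe now: value is known to be 0..255
        }
        return buffer;
    }
}
```

Checking the int first matters because a data byte of 0xFF becomes -1 once cast, so testing after the cast would mistake real data for end-of-stream.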
Another thing: Why are you copying the aBytes array in your bytesToLong method? You can simplify that method considerably and save the unnecessary copy:
public static Long bytesToLong(byte[] aBytes, int start) {
return ByteBuffer.wrap(aBytes, start, 8).order(ByteOrder.BIG_ENDIAN).getLong();
}
Upvotes: 6