Binary data manipulation in Java

Question

I spent the better part of a day chasing down a binary reconstruction bug, and I want to understand why it happened.

The specific line of code looked like this (dataBuffer is an array of bytes):

short data = (short) ((short)dataBuffer[curPos + 3] << 8 | ((short)dataBuffer[curPos + 2]));

It sporadically returned garbage until I added a mask to the low-order byte:

short data = (short) ((short)dataBuffer[curPos + 3] << 8 | (((short)dataBuffer[curPos + 2])) & 0xff);

So, my interpretation is that the type-cast from byte to short occasionally leaves behind trash in the high-order byte, causing issues when it's OR-ed... but that doesn't make a whole lot of sense.

This code is taken from C++ and worked great there... what am I missing?

Ted Hopp · Accepted Answer

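It's sign extension. The byte type in Java is signed (range -128 to 127), so any byte whose bit pattern is 0x80 or above represents a negative number. Thus, a byte value of, say, 0x90 (= 144 decimal if read as unsigned) is actually treated as -112 when it is a byte. When it is widened to a short, the sign bit is copied into the upper byte and it becomes 0xff90 (still -112). OR-ing that into the result sets every bit of the high byte, which wipes out the value shifted in from dataBuffer[curPos + 3]; this is why the bug only shows up when the low byte is 0x80 or above. You need to mask the value with 0xff to recover the desired bit pattern of 0x0090.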
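
As a minimal sketch of the effect, the following reproduces the 0x90 example. The class and method names (SignExtensionDemo, widen, unsigned) are made up for illustration and are not part of the original code.

```java
public class SignExtensionDemo {
    // Widening a byte to a short copies the sign bit into the upper byte.
    static short widen(byte b) {
        return (short) b;
    }

    // Masking with 0xff keeps only the low eight bits, giving the unsigned value.
    static int unsigned(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x90;
        System.out.println(b);                                        // -112
        System.out.println(Integer.toHexString(widen(b) & 0xffff));   // ff90
        System.out.println(Integer.toHexString(unsigned(b)));         // 90
    }
}
```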

As an aside, you can eliminate a couple of casts from your second expression:

short data = (short) ((dataBuffer[curPos + 3] << 8) | (dataBuffer[curPos + 2] & 0xff));

Those casts are, in fact, quite useless. Operands to bitwise operators are always promoted to int values¹ before the operator is applied.

¹ Or long, if any of them are long.
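
To see the unmasked and masked expressions side by side, here is a small sketch. It assumes the two bytes sit at pos and pos + 1 rather than at curPos + 2 and curPos + 3, and the class and method names are hypothetical. The last method shows that java.nio.ByteBuffer with little-endian byte order gives the same result as the masked expression, which may be a convenient alternative to hand-written shifts.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianShort {
    // Unmasked: a negative low byte is sign-extended to int, and its
    // upper 1 bits overwrite the shifted high byte in the OR.
    static short readUnmasked(byte[] buf, int pos) {
        return (short) ((buf[pos + 1] << 8) | buf[pos]);
    }

    // Masked: the low byte contributes only bits 0-7.
    static short readMasked(byte[] buf, int pos) {
        return (short) ((buf[pos + 1] << 8) | (buf[pos] & 0xff));
    }

    // Same result using the standard library instead of manual shifts.
    static short readWithByteBuffer(byte[] buf, int pos) {
        return ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).getShort(pos);
    }

    public static void main(String[] args) {
        byte[] buf = {(byte) 0x90, 0x12};   // little-endian encoding of 0x1290
        System.out.printf("%04x%n", readUnmasked(buf, 0) & 0xffff);       // ff90
        System.out.printf("%04x%n", readMasked(buf, 0) & 0xffff);         // 1290
        System.out.printf("%04x%n", readWithByteBuffer(buf, 0) & 0xffff); // 1290
    }
}
```

With a low byte below 0x80 (for example 0x10), all three methods agree, which matches the sporadic nature of the original bug.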
