Reputation: 2164
I'm trying to read a binary file in Java. I need methods to read unsigned 8-bit, 16-bit and 32-bit values. What would be the best way (fastest, nicest-looking code) to do this? I've done this in C++ with something like this:
uint8_t *buffer;
uint32_t value = buffer[0] | buffer[1] << 8 | buffer[2] << 16 | buffer[3] << 24;
But in Java this causes a problem if, for example, buffer[1] contains a value with its sign bit set, because the result of a left shift is an int (?). Instead of OR-ing in only 0xA5 at the specific place, it ORs in 0xFFFFA500 or something like that, which "damages" the two top bytes.
I have code right now that looks like this:
public long getUInt32() throws EOFException, IOException {
    byte[] bytes = getBytes(4);
    long value = bytes[0] | (bytes[1] << 8) | (bytes[2] << 16) | (bytes[3] << 24);
    return value & 0x00000000FFFFFFFFL;
}
If I want to convert the four bytes 0x67 0xA5 0x72 0x50 the result is 0xFFFFA567 instead of 0x5072A567.
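To make the failure concrete, here is a minimal sketch that reproduces it with the same four bytes. The class and method names are made up for illustration. It shows that a byte is promoted to a sign-extended int before the shift, which is where the stray 0xFFFF comes from.

```java
// Illustrative only: SignExtensionDemo and its methods are hypothetical names.
final class SignExtensionDemo {
    // A byte is widened to int (with sign extension) before << is applied.
    static int shifted(byte b) {
        return b << 8;
    }

    // Same expression as in the question: no per-byte masking.
    static long unmasked(byte[] b) {
        long value = b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24);
        return value & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] bytes = {0x67, (byte) 0xA5, 0x72, 0x50};
        System.out.printf("0x%08X%n", shifted(bytes[1]));  // 0xFFFFA500
        System.out.printf("0x%08X%n", unmasked(bytes));    // 0xFFFFA567
    }
}
```

The final mask in unmasked() comes too late: the upper two bytes were already overwritten inside the 32-bit int expression.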
Edit: This works great:
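public long getUInt32() throws EOFException, IOException {
    byte[] bytes = getBytes(4);
    long value = bytes[0] & 0xFF;
    value |= (bytes[1] << 8) & 0xFFFF;
    value |= (bytes[2] << 16) & 0xFFFFFF;
    // The mask must be a long literal: with the int mask 0xFFFFFFFF a top byte >= 0x80
    // stays a negative int and sign-extends when OR-ed into the long.
    value |= (bytes[3] << 24) & 0xFFFFFFFFL;
    return value;
}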
But isn't there a better way to do this? Ten bit operations seem a "bit" much for something this simple. (See what I did there?) =)
Upvotes: 4
Views: 27713
Reputation: 56772
A more regular version converts the bytes to their unsigned values as integers first:
public long getUInt32() throws EOFException, IOException {
    byte[] bytes = getBytes(4);
    long value =
        ((bytes[0] & 0xFF) << 0) |
        ((bytes[1] & 0xFF) << 8) |
        ((bytes[2] & 0xFF) << 16) |
        ((long) (bytes[3] & 0xFF) << 24);
    return value;
}
Don't get hung up on the number of bit operations; most likely the compiler will optimize those to byte operations.
Also, you shouldn't be using long for 32-bit values just to avoid the sign; you can use int and ignore the fact that it is signed most of the time. See this answer.
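As a rough illustration of keeping the value in an int, the sketch below assumes Java 8 or later, where java.lang.Integer has unsigned helper methods. The class name and the two wrapper methods are hypothetical. The bit pattern stays the same and only the interpretation changes.

```java
// Illustrative only: UnsignedIntDemo and its wrappers are hypothetical names.
final class UnsignedIntDemo {
    // Widen to long only at the point where the numeric value is needed.
    static long toUnsigned(int raw) {
        return Integer.toUnsignedLong(raw);
    }

    // Unsigned comparison without widening.
    static boolean lessThanUnsigned(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }

    public static void main(String[] args) {
        int raw = 0xFFFFA567;                                  // same 32 bits either way
        System.out.println(raw);                               // -23193
        System.out.println(Integer.toUnsignedString(raw));     // 4294944103
        System.out.println(toUnsigned(raw));                   // 4294944103
        System.out.println(lessThanUnsigned(1, raw));          // true
    }
}
```

Addition, subtraction, multiplication and bitwise operations give the same bits for signed and unsigned ints, so the helpers are only needed for comparison, division and printing.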
Update: The cast to long for the most significant byte is needed, because its most significant bit would otherwise be shifted into the sign bit of a 32-bit integer, potentially making it negative.
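Since the question asks for 8-, 16- and 32-bit readers, the same mask-then-shift pattern could be applied to all three. This is only a sketch: the class and method names are invented for illustration, and it masks with the long literal 0xFFL instead of casting, which has the same effect as the cast above.

```java
// Illustrative only: LittleEndian, u8, u16 and u32 are hypothetical names, not a library API.
final class LittleEndian {
    private LittleEndian() {}

    // Unsigned 8-bit value: masking removes the sign extension.
    static int u8(byte[] b, int off) {
        return b[off] & 0xFF;
    }

    // Unsigned 16-bit value, least significant byte first; fits in an int.
    static int u16(byte[] b, int off) {
        return (b[off] & 0xFF) | (b[off + 1] & 0xFF) << 8;
    }

    // Unsigned 32-bit value, least significant byte first; needs a long.
    static long u32(byte[] b, int off) {
        return (b[off] & 0xFFL)
             | (b[off + 1] & 0xFFL) << 8
             | (b[off + 2] & 0xFFL) << 16
             | (b[off + 3] & 0xFFL) << 24;
    }

    public static void main(String[] args) {
        byte[] data = {0x67, (byte) 0xA5, 0x72, (byte) 0xF0};
        System.out.println(u8(data, 1));   // 165
        System.out.println(u16(data, 0));  // 42343
        System.out.println(u32(data, 0));  // 4034045287
    }
}
```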
Upvotes: 5
Reputation: 23265
You've got the right idea; I don't think there's any obvious improvement. If you look at the java.io.DataInput.readInt spec, they have code for the same thing. They switch the order of << and &, but otherwise it's standard.
There is no language-level way to read an int in one go from a byte array; you would have to go through java.nio (a ByteBuffer wrapping the array, or a memory-mapped region, which is way overkill for this).
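For comparison, java.nio.ByteBuffer can wrap an existing byte array and decode multi-byte values in a chosen byte order. The sketch below assumes little-endian data, as in the question. The class and method names are invented for illustration. Whether this is faster than the manual shifts is not something this example tries to show.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: ByteBufferRead and its methods are hypothetical names.
final class ByteBufferRead {
    // Reads 4 bytes at the given offset as little-endian, then drops the sign.
    static long u32(byte[] bytes, int offset) {
        int raw = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt(offset);
        return raw & 0xFFFFFFFFL;
    }

    // Reads 2 bytes at the given offset as little-endian, then drops the sign.
    static int u16(byte[] bytes, int offset) {
        short raw = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getShort(offset);
        return raw & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] data = {0x67, (byte) 0xA5, 0x72, 0x50};
        System.out.printf("0x%08X%n", u32(data, 0));  // 0x5072A567
        System.out.printf("0x%04X%n", u16(data, 0));  // 0xA567
    }
}
```

In real code you would probably wrap the array once and reuse the buffer instead of wrapping on every call.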
Of course, you could use a DataInputStream directly instead of reading into a byte[] first:
DataInputStream d = new DataInputStream(new FileInputStream("myfile"));
d.readInt();
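DataInputStream is big-endian, the opposite of the byte order you are using, so you'll also need some Integer.reverseBytes calls. readInt also returns a signed int, so mask the result with 0xFFFFFFFFL if you want the unsigned value as a long. It won't be any faster, but it's cleaner.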
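A minimal sketch of that approach, assuming little-endian input as in the question. The class and method names are invented for illustration, and a ByteArrayInputStream stands in for the file so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Illustrative only: DataInputLittleEndian and its methods are hypothetical names.
final class DataInputLittleEndian {
    // readInt() decodes big-endian; reverseBytes flips it to little-endian order.
    static long readUInt32LE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt()) & 0xFFFFFFFFL;
    }

    // Same idea for 16 bits, using Short.reverseBytes.
    static int readUInt16LE(DataInputStream in) throws IOException {
        return Short.reverseBytes(in.readShort()) & 0xFFFF;
    }

    // A single byte has no byte order; DataInputStream already offers an unsigned read.
    static int readUInt8(DataInputStream in) throws IOException {
        return in.readUnsignedByte();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x67, (byte) 0xA5, 0x72, 0x50, 0x34, 0x12, (byte) 0xFF};
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            System.out.printf("0x%08X%n", readUInt32LE(in));  // 0x5072A567
            System.out.printf("0x%04X%n", readUInt16LE(in));  // 0x1234
            System.out.println(readUInt8(in));                // 255
        }
    }
}
```

For a real file, wrapping the FileInputStream in a BufferedInputStream before the DataInputStream is usually worthwhile, because each readInt otherwise turns into several small reads.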
Upvotes: 2