Riyad Kalla

Reputation: 10702

Using sun.misc.Unsafe, what is the fastest way to scan bytes from a Direct ByteBuffer?

BACKGROUND

Assume I have a direct ByteBuffer:

ByteBuffer directBuffer = ByteBuffer.allocateDirect(1024);

and assume I am passing the buffer to an AsynchronousSocketChannel to read chunks of data off that socket up to X bytes at a time (1024 in the example here).
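
For context, a minimal sketch of that setup, assuming a blocking-style read via the Future API (the host, port, and error handling here are purely illustrative):

    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.AsynchronousSocketChannel;
    import java.util.concurrent.Future;

    public class DirectRead {
        public static void main(String[] args) throws Exception {
            ByteBuffer directBuffer = ByteBuffer.allocateDirect(1024);

            // Read up to 1024 bytes straight into native memory; no intermediate
            // heap byte[] is involved in the transfer itself.
            AsynchronousSocketChannel channel = AsynchronousSocketChannel.open();
            channel.connect(new InetSocketAddress("example.com", 80)).get();

            Future<Integer> pending = channel.read(directBuffer);
            int bytesRead = pending.get(); // block here just to keep the example short

            directBuffer.flip(); // prepare the buffer for scanning
            System.out.println("read " + bytesRead + " bytes");
            channel.close();
        }
    }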

The transfer time off the socket into the direct ByteBuffer is fantastic because it is all occurring in native OS memory space; I haven't passed through the JVM "blood-brain" barrier yet...

QUESTION

Assuming my job is to scan through all the bytes read back in from the direct byte buffer, what is the fastest way for me to do this?

I originally asked "... utilizing sun.misc.Unsafe" but maybe that is the wrong assumption.

POSSIBLE APPROACHES

I currently see three approaches and the one I am most curious about is #3:

  1. (DEFAULT) Use ByteBuffer's bulk-get to pull bytes directly from native OS space into an internal byte[1024] construct.
  2. (UNSAFE) Use Unsafe's getByte ops to pull the values directly out of the ByteBuffer, skipping all the bounds-checking of ByteBuffer's standard get ops (a rough sketch of this approach follows the list). Peter Lawrey's answer here seemed to suggest that those raw native methods in Unsafe can even be optimized by the JIT compiler ("intrinsics") into single machine instructions, leading to even more fantastic access times. (===UPDATE=== Interesting, it looks like the underlying DirectByteBuffer class does exactly this with its get/put ops, for those interested.)
  3. (BANANAS) In some crime-against-humanity sort of way, using Unsafe, can I copy the memory region of the direct ByteBuffer to the same memory address my byte[1024] exists at inside of the VM, and just start accessing the array using standard int indexes? (This makes the assumption that the "copyMemory" operation can potentially do something fantastically optimized at the OS level.)
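
For reference, here is a minimal sketch of approach #2, assuming Unsafe and the buffer's base address are obtained reflectively (both the theUnsafe singleton and the Buffer.address field are JDK internals, so treat this as illustrative rather than portable):

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;
    import sun.misc.Unsafe;

    public class UnsafeScan {
        public static void main(String[] args) throws Exception {
            // Unsafe.getUnsafe() rejects application classes, so grab the
            // singleton reflectively instead.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);

            ByteBuffer directBuffer = ByteBuffer.allocateDirect(1024);
            int length = directBuffer.remaining();

            // Base address of the native memory backing the direct buffer, read
            // from the package-private "address" field of java.nio.Buffer.
            Field addressField = java.nio.Buffer.class.getDeclaredField("address");
            addressField.setAccessible(true);
            long base = addressField.getLong(directBuffer);

            // Scan the raw memory with Unsafe.getByte -- no bounds checks, no copy.
            long sum = 0;
            for (int i = 0; i < length; i++) {
                sum += unsafe.getByte(base + i);
            }
            System.out.println("sum = " + sum);
        }
    }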

It does occur to me that, even assuming the copyMemory operation does exactly what it advertises in the more-optimal OS space, approach #2 above is probably still the most optimized, since I am not creating a duplicate of the buffer before beginning to process it.

This IS different from the "can I use Unsafe to iterate over a byte[] faster?" question, as I am not even planning on pulling the bytes into a byte[] internally if it isn't necessary.

Thanks for the time; just curious if anyone (Peter?) has gotten nuts with Unsafe to do something like this.

Upvotes: 11

Views: 2930

Answers (1)

ZhongYu

Reputation: 19682

ByteBuffer methods are extremely fast because they are intrinsics; the VM maps them to very low-level instructions. Compare these two approaches:

    // N = buffer size, M = benchmark repetitions
    byte[] bytes = new byte[N];
    long sum = 0;
    for(int m=0; m<M; m++)
        for(int i=0; i<bytes.length; i++)
            sum += bytes[i];

    // the same scan over a direct ByteBuffer, using the absolute get(int)
    ByteBuffer bb = ByteBuffer.allocateDirect(N);
    for(int m=0; m<M; m++)
        for(int i=0; i<bb.remaining(); i++)
            sum += bb.get(i);

On my machine, the difference is 0.67 ns vs 0.81 ns (per loop iteration).

I'm a little surprised that ByteBuffer is not as fast as byte[], but I think you should definitely NOT copy it into a byte[] and then access it.

Upvotes: 1
