Reputation: 66
I wrote a JMH benchmark to diagnose some odd throughput issues I was seeing when using Netty buffers in place of NIO buffers. The Netty direct ByteBuf is significantly slower than its NIO ByteBuffer counterpart when writing byte by byte. Even more interesting, if I get an NIO ByteBuffer from the Netty ByteBuf, the performance is on par with a plain NIO ByteBuffer. So I can be certain the cause isn't the underlying direct memory or the internal ByteBuffer, but something in the layers of ByteBuf. Is this expected? Am I using it wrong?
Here are the raw results.
Benchmark                                              Mode   Cnt       Score  Error  Units
ByteBufferBenchmark.directByteBuffer                   thrpt    2  206815.012         ops/s
ByteBufferBenchmark.heapByteBuffer                     thrpt    2  159197.697         ops/s
ByteBufferBenchmark.pooledDirectByteBuf                thrpt    2  120753.217         ops/s
ByteBufferBenchmark.pooledDirectByteBufAsByteBuffer    thrpt    2  204986.976         ops/s
ByteBufferBenchmark.pooledHeapByteBuf                  thrpt    2  121846.543         ops/s
ByteBufferBenchmark.pooledHeapByteBufAsByteBuffer      thrpt    2  159503.425         ops/s
ByteBufferBenchmark.unpooledDirectByteBuf              thrpt    2  121781.355         ops/s
ByteBufferBenchmark.unpooledDirectByteBufAsByteBuffer  thrpt    2  208623.215         ops/s
ByteBufferBenchmark.unpooledHeapByteBuf                thrpt    2  158904.532         ops/s
ByteBufferBenchmark.unpooledHeapByteBufAsByteBuffer    thrpt    2  160171.685         ops/s
directByteBuffer = ByteBuffer.allocateDirect
heapByteBuffer = ByteBuffer.allocate
*DirectByteBuf = ByteBufAllocator.directBuffer
*HeapByteBuf = ByteBufAllocator.heapBuffer
pool* = PooledByteBufAllocator.DEFAULT
unpool* = UnpooledByteBufAllocator.DEFAULT
*AsByteBuffer = ByteBuf.nioBuffer
Upvotes: 1
Views: 521
Reputation: 191
Please share the whole code (and the JDK version) if possible; without it I cannot tell in which operations the Netty buffers seem to be slower.
As a general rule of thumb: JDK classes can (and many of them very likely will) benefit from being "good citizens" with the JVM itself, i.e. their operations are intrinsified (see the general concept at https://en.m.wikipedia.org/wiki/Intrinsic_function) and, as a consequence, their optimized code is inlined (see http://normanmaurer.me/blog/2014/05/15/Inline-all-the-Things/). In short, NIO ByteBuffer plays "dirty" against Netty ByteBuf: depending on the JVM version, it benefits from several optimizations that are simply not available to regular user-defined data types. Returning to your question: yes, this can be expected, depending on the operations and the calling context (which influences inlining of ByteBuf operations and the chances of vectorization and bounds-check elimination). I recently fixed a related issue in Netty in https://github.com/netty/netty/pull/10368; feel free to dive into the long list of comments, I am sure it will help answer your question.
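Since the benchmark source isn't posted, here is only a rough, JDK-only sketch of what I assume the NIO baseline does: one relative `put` per byte until the buffer is full. The class name `ByteByByteWrite`, the helper `fill`, and the 1024-byte capacity are all made up for illustration and are not taken from the question. The point is that the hot path is a single tiny call per byte, which is exactly where inlining and bounds-check elimination decide the outcome.

```java
import java.nio.ByteBuffer;

public class ByteByByteWrite {

    // Hypothetical helper (not from the question): resets the buffer and
    // writes one byte at a time with the relative put until it is full.
    // Returns the number of bytes written.
    static int fill(ByteBuffer buffer) {
        buffer.clear(); // position = 0, limit = capacity
        int written = 0;
        while (buffer.hasRemaining()) {
            buffer.put((byte) written); // the per-byte call the JIT must inline
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        int capacity = 1024; // assumed size, the question does not state one

        ByteBuffer heap = ByteBuffer.allocate(capacity);
        ByteBuffer direct = ByteBuffer.allocateDirect(capacity);

        System.out.println("heap wrote " + fill(heap) + " bytes");
        System.out.println("direct wrote " + fill(direct) + " bytes");
    }
}
```

The Netty variants would presumably run the same loop through `ByteBuf.writeByte(int)`, which adds its own index and accessibility checks on top of the raw memory write. This is not a substitute for the real JMH harness; a plain loop like this says nothing reliable about throughput.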
Upvotes: 2