Reputation: 57325
I've seen other references to this issue, such as here and here, although those concern different versions of Netty. I tried the latest release in the 4.0 branch (4.0.29) and in the 5.0 alpha branch (5.0-Alpha3). Locally (non-Linux, JDK 1.8.0_40) everything is fine. On the remote Linux machine (JDK 1.8.0_25-b17, kernel 2.6.32) the process sits at 100% CPU.
Tried using EpollEventLoopGroup();
Tried calling
workerGroup = new NioEventLoopGroup();
workerGroup.rebuildSelectors();
Can anyone offer any suggestions? I've seen references to this bug with different versions of Netty. Is it a JDK bug or a Netty bug? The process goes to 100% CPU immediately on startup and stays there.
Update: Upgraded to Java 1.8.0_45; same result.
jstack output of all runnable threads (there's some RabbitMQ stuff in there, only included for completeness; that's common to other applications and is not the cause of the problem).
Upvotes: 14
Views: 4675
Reputation: 13696
As we identified in the comments, the thread that consumed CPU is busy in the following stack:
"pool-9-thread-1" #49 prio=5 os_prio=0 tid=0x00007ffd508e8000 nid=0x3a0c runnable [0x00007ffd188b6000]
java.lang.Thread.State: RUNNABLE
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.poll(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
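If attaching jstack is inconvenient, a rough in-process alternative is to ask the JVM which live thread has accumulated the most CPU time. The sketch below uses the standard `java.lang.management.ThreadMXBean` API; the class and method names (`HotThreadFinder`, `busiestThreadName`) are made up for illustration, and it assumes the JVM supports per-thread CPU time measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class HotThreadFinder {
    // Hypothetical helper (not from the original post): returns the name of the
    // live thread with the highest accumulated CPU time, or null if the JVM
    // cannot measure per-thread CPU time.
    static String busiestThreadName() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return null;
        }
        long bestId = -1;
        long bestCpuNanos = -1;
        for (long id : threads.getAllThreadIds()) {
            // getThreadCpuTime returns -1 for threads that are no longer alive.
            long cpuNanos = threads.getThreadCpuTime(id);
            if (cpuNanos > bestCpuNanos) {
                bestCpuNanos = cpuNanos;
                bestId = id;
            }
        }
        ThreadInfo info = bestId < 0 ? null : threads.getThreadInfo(bestId);
        return info == null ? null : info.getThreadName();
    }

    public static void main(String[] args) {
        System.out.println("Busiest thread: " + busiestThreadName());
    }
}
```

A thread name such as "pool-9-thread-1" can then be matched against the jstack output to find the stack it is spinning in.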
I managed to reproduce similar behavior by creating a ScheduledThreadPoolExecutor, configuring it to allow core threads to time out, and scheduling a lot of repeating tasks with a short delay. It consumes a lot of CPU on my machine, and the jstack output is similar (sometimes deeper into the poll method). This code reproduces it:
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
// The keep-alive time must be positive before core threads may time out.
executor.setKeepAliveTime(1, TimeUnit.MINUTES);
executor.allowCoreThreadTimeOut(true);
// Schedule 1000 no-op tasks that each repeat every nanosecond.
for (long i = 0; i < 1000; i++) {
    executor.scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
        }
    }, 0, 1, TimeUnit.NANOSECONDS);
}
Now we just have to identify which code sets up a broken ScheduledThreadPoolExecutor. I searched through the RabbitMQ and Netty source code without finding anything obvious. Could it be something you do in your own code?
Edit: As mentioned in the comments, the root cause was a ScheduledThreadPoolExecutor initialized with a core pool size of 0, which apparently can cause a CPU spin on some platforms. This was done in the OP's code.
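A minimal sketch of one way to guard against this, assuming the pool size comes from configuration or a computed value that might be 0. The helper name `newScheduler` and the class `SafeSchedulers` are hypothetical, not taken from the OP's code; the idea is simply to never pass 0 as the core pool size.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SafeSchedulers {
    // Hypothetical helper (not from the original post): clamps the requested
    // core pool size to at least 1 so the executor is never created with 0.
    static ScheduledThreadPoolExecutor newScheduler(int requestedCoreThreads) {
        return new ScheduledThreadPoolExecutor(Math.max(1, requestedCoreThreads));
    }

    public static void main(String[] args) throws InterruptedException {
        // A requested size of 0 is silently raised to 1.
        ScheduledThreadPoolExecutor executor = newScheduler(0);
        CountDownLatch done = new CountDownLatch(1);
        Runnable task = done::countDown;
        executor.schedule(task, 50, TimeUnit.MILLISECONDS);
        boolean ran = done.await(5, TimeUnit.SECONDS);
        System.out.println("core pool size = " + executor.getCorePoolSize()
                + ", task ran = " + ran);
        executor.shutdown();
    }
}
```

If idle threads are a concern, keeping one core thread and combining setKeepAliveTime with allowCoreThreadTimeOut(true), as in the reproduction snippet, lets the pool shrink without using a core size of 0.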
Upvotes: 12