Reputation: 6678
I have a Linux (Ubuntu Trusty) server on an AWS m4.xlarge, so 4 CPUs and 16 GB of RAM. It runs a Java application on Tomcat 7 with Oracle Java 8.
Very frequently the app hangs and stops accepting new connections. StatusCake reports it as down because the response times out, and Datadog shows that threads are maxed out. However, there is no increase in CPU usage (it stays at barely 10%), and RAM usage remains unchanged during that period.
Only a Tomcat restart fixes the problem, and only temporarily (for roughly 12 hours). I took a thread dump and saw a great many threads in a WAITING state. Since this is very new to me, I cannot make sense of the data.
I was hoping I could get help here and eventually master the art of deciphering a thread dump. I have attached it here and also uploaded it to fastthread.io, which says there is no problem. I have also uploaded the full thread dump to zerobin.
I would be very grateful if anyone here could shed some light on this, and I hope it will help others in the same situation. Thanks in advance.
Upvotes: 1
Views: 255
Reputation: 646
Lots of threads are in the WAITING state, and for many of them that is perfectly fine. For example, there are threads with the following stack trace:
...
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
...
This only means that these threads are idle worker threads waiting for a task to do.
However, other stacks do not look good:
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1414)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:606)
- locked <0x000000055c2d3ce0> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:526)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:755)
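Those threads are stuck in awaitAvailable, waiting for a connection to become free in the pool. C3P0 is a pool of database connections: instead of creating a new connection every time, connections are cached in the pool. When the application closes a pooled connection, the underlying connection is not actually closed but only returned to the pool. So if Hibernate (or any other user of the pool) for some reason does not close a connection after it has finished with it, the connection is never returned and the pool can become exhausted. That would also fit your symptoms: threads blocked this way use no CPU, and every new request ties up another Tomcat worker thread until the thread limit is reached.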
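To tell the harmless waiters from the suspicious ones in a large dump, it can help to count threads by a distinctive stack frame. Below is a minimal sketch, assuming the dump is in the usual jstack text layout where thread entries begin with a quoted thread name and are separated by blank lines. The class name `ThreadDumpSummary`, the method `countByMarker`, and the sample thread names are made up for illustration; only the two frame names come from the traces above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ThreadDumpSummary {

    /**
     * Counts how many thread entries in a jstack-style dump contain each marker.
     * Assumes entries start with a quoted thread name and are separated by blank lines.
     */
    static Map<String, Integer> countByMarker(String dump, String... markers) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String marker : markers) {
            counts.put(marker, 0);
        }
        // Split the dump into blank-line-separated blocks.
        for (String block : dump.split("\\n\\s*\\n")) {
            if (!block.trim().startsWith("\"")) {
                continue; // header, footer, or lock section: not a thread entry
            }
            for (String marker : markers) {
                if (block.contains(marker)) {
                    counts.merge(marker, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Tiny hand-written sample; thread names are invented for the example.
        String sample =
              "\"worker-1\" daemon\n"
            + "   java.lang.Thread.State: WAITING (parking)\n"
            + "\tat org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)\n"
            + "\n"
            + "\"worker-2\" daemon\n"
            + "   java.lang.Thread.State: WAITING (on object monitor)\n"
            + "\tat com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1414)\n";

        Map<String, Integer> counts = countByMarker(
                sample, "TaskQueue.take", "BasicResourcePool.awaitAvailable");
        counts.forEach((marker, n) -> System.out.println(marker + ": " + n));
    }
}
```

If most of the Tomcat worker threads turn out to be in awaitAvailable rather than TaskQueue.take, that points to the connection pool rather than to a lack of incoming work.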
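To resolve the issue, you have to find out why some connections are not closed after use. Start by looking through your code for places where a connection or session is opened but not closed on every path, including when an exception is thrown.
Another option is to run temporarily without C3P0 (that is, without pooling). This is not a permanent solution, but it at least lets you check whether this guess is right.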
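As an illustration of what such a leak looks like, here is a small self-contained sketch. It does not use C3P0 or Hibernate: `ToyPool` and `ToyConnection` are hypothetical stand-ins, built on a semaphore, that only mimic the "close returns it to the pool" behaviour described above. The leaky method borrows a connection and never closes it; the safe one uses try-with-resources so the connection goes back even if the work throws.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolLeakDemo {

    /** Hypothetical stand-in for a connection pool; this is not C3P0's API. */
    static class ToyPool {
        private final Semaphore permits;

        ToyPool(int size) {
            permits = new Semaphore(size);
        }

        /** Borrows a connection, or returns null if none becomes free within the timeout. */
        ToyConnection checkout(long timeoutMillis) throws InterruptedException {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)
                    ? new ToyConnection(this)
                    : null;
        }

        int available() {
            return permits.availablePermits();
        }
    }

    /** Closing does not destroy anything; it only hands the slot back to the pool. */
    static class ToyConnection implements AutoCloseable {
        private final ToyPool pool;
        private boolean closed;

        ToyConnection(ToyPool pool) {
            this.pool = pool;
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                pool.permits.release();
            }
        }
    }

    /** Leaky: the connection is borrowed and never closed, so it never returns to the pool. */
    static void leakyWork(ToyPool pool) throws InterruptedException {
        ToyConnection c = pool.checkout(50);
        // ... work with c, but no close() ...
    }

    /** Safe: try-with-resources closes the connection on every path, including exceptions. */
    static void safeWork(ToyPool pool) throws InterruptedException {
        try (ToyConnection c = pool.checkout(50)) {
            // ... work with c ...
        }
    }

    /** Runs one kind of work repeatedly and reports how many connections are still free. */
    static int availableAfter(boolean leaky, int calls, int poolSize) throws InterruptedException {
        ToyPool pool = new ToyPool(poolSize);
        for (int i = 0; i < calls; i++) {
            if (leaky) {
                leakyWork(pool);
            } else {
                safeWork(pool);
            }
        }
        return pool.available();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("free after leaky calls: " + availableAfter(true, 3, 3));
        System.out.println("free after safe calls:  " + availableAfter(false, 10, 3));
    }
}
```

In real code the same pattern applies to a `java.sql.Connection` obtained from a `DataSource`, since it is also `AutoCloseable`; with Hibernate, the equivalent is making sure every opened session is closed, for example in a `finally` block. Once the toy pool is drained, every further `checkout` just waits, which is the situation the awaitAvailable frames in the dump show.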
Upvotes: 2