Recently one of our production Tomcat servers became unresponsive because Tomcat's busy thread count shot up to 200. When we took a thread dump before restarting, we found 100 threads in the TIMED_WAITING state, like these 3 threads:
""http-bio-7007"-exec-241" daemon prio=10 tid=0x00002aaab107b000 nid=0x59df waiting on condition [0x0000000051239000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000580d877d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:424)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:86)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:32)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)
""http-bio-7007"-exec-237" daemon prio=10 tid=0x00002aaab186e000 nid=0x596d waiting on condition [0x000000004d1f9000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000580d877d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:424)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:86)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:32)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)
""http-bio-7007"-exec-236" daemon prio=10 tid=0x00002aaab1118000 nid=0x596c waiting on condition [0x000000004e50c000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000580d877d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:424)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:86)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:32)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)
We also have 4 application thread pools (e.g. pool-4-thread-20) with 20 threads each, so I am not sure which blocking queue these 100 threads are waiting on. We are using the c3p0 connection pool with Hibernate, which doesn't seem to be the cause of this.
Any idea what java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject is?
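For what it's worth, ConditionObject is the Condition implementation inside AbstractQueuedSynchronizer, and the frames in the dump (ThreadPoolExecutor.getTask calling poll on the task queue) are what a pool worker looks like while it waits, with a timeout, for its next task. The sketch below tries to reproduce that state with a plain JDK executor rather than Tomcat's; the class name, the thread name "demo-exec-1", and the pool settings are assumptions made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

public class IdleWorkerStateSketch {

    /** Runs one trivial task, then reports the state of the now idle worker thread. */
    static Thread.State idleWorkerState() {
        List<Thread> workers = new CopyOnWriteArrayList<>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(),
                r -> {
                    // Remember the worker thread so its state can be inspected later.
                    Thread t = new Thread(r, "demo-exec-1");
                    t.setDaemon(true);
                    workers.add(t);
                    return t;
                });
        // With a keep-alive in effect, an idle worker uses the timed poll() on the queue.
        pool.allowCoreThreadTimeOut(true);
        try {
            pool.execute(() -> { });
            Thread worker = workers.get(0);
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            // Give the worker a moment to finish the task and park on the empty queue.
            while (worker.getState() != Thread.State.TIMED_WAITING
                    && System.nanoTime() < deadline) {
                LockSupport.parkNanos(1_000_000L);
            }
            return worker.getState();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("idle worker state: " + idleWorkerState());
    }
}
```

A thread dump taken while such a worker is idle should show the same parkNanos / awaitNanos / LinkedBlockingQueue.poll frames as above, which suggests those threads are waiting for work rather than being stuck on application code.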
This was resolved when we fixed our code, which was leaking DB connections managed by c3p0. There were a few flows where we were not explicitly calling rollback() in the catch block before closing the EntityManager in the finally block, so when an exception occurred the connection was not returned to the pool. If exceptions were frequent enough (more than the pool size within the timeout interval), all other threads would pile up waiting for a connection.
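A minimal sketch of the leak and the fix, using only the JDK so it runs without JPA or c3p0 on the classpath. FakePool and FakeEntityManager are hypothetical stand-ins, and the rule that close() hands the connection back only when no transaction is still open is an assumption that models the behaviour described above, not the actual Hibernate or c3p0 implementation.

```java
import java.util.concurrent.Semaphore;

public class RollbackBeforeCloseSketch {

    /** Hypothetical fixed-size pool; each permit stands in for one DB connection. */
    static final class FakePool {
        private final Semaphore permits;
        FakePool(int size) { permits = new Semaphore(size); }
        boolean checkout() { return permits.tryAcquire(); }
        void checkin() { permits.release(); }
        int available() { return permits.availablePermits(); }
    }

    /** Hypothetical stand-in for an entity manager holding one pooled connection. */
    static final class FakeEntityManager {
        private final FakePool pool;
        private boolean txActive;
        private boolean holdsConnection;

        FakeEntityManager(FakePool pool) { this.pool = pool; }

        void begin() {
            if (!pool.checkout()) throw new IllegalStateException("pool exhausted");
            holdsConnection = true;
            txActive = true;
        }
        void commit() { txActive = false; }
        void rollback() { txActive = false; }
        boolean isActive() { return txActive; }

        /** Modelled assumption: the connection goes back only if no transaction is open. */
        void close() {
            if (holdsConnection && !txActive) {
                pool.checkin();
                holdsConnection = false;
            }
        }
    }

    /** The leaking flow: the exception is handled, but the transaction is never rolled back. */
    static void leakyFlow(FakePool pool) {
        FakeEntityManager em = new FakeEntityManager(pool);
        try {
            em.begin();
            throw new RuntimeException("simulated failure");
        } catch (RuntimeException e) {
            // Bug: no rollback, so the transaction is still open when close() runs.
        } finally {
            em.close();
        }
    }

    /** The fixed flow: roll back in the catch block before closing in finally. */
    static void fixedFlow(FakePool pool) {
        FakeEntityManager em = new FakeEntityManager(pool);
        try {
            em.begin();
            throw new RuntimeException("simulated failure");
        } catch (RuntimeException e) {
            if (em.isActive()) em.rollback();
        } finally {
            em.close();
        }
    }

    public static void main(String[] args) {
        FakePool leaky = new FakePool(2);
        leakyFlow(leaky);
        leakyFlow(leaky);
        System.out.println("available after leaky flows: " + leaky.available());

        FakePool fixed = new FakePool(2);
        fixedFlow(fixed);
        fixedFlow(fixed);
        System.out.println("available after fixed flows: " + fixed.available());
    }
}
```

In this model, two failing calls to leakyFlow drain a pool of size 2, after which every further begin() fails, while fixedFlow leaves the pool full. With real JPA code the equivalent guard would be a check along the lines of getTransaction().isActive() followed by rollback() in the catch block, before close() in finally.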