Update: This looks like a memory issue. A 3.8 Gb Hprof file indicated that the JVM was dumping-its-heap when this "blocking" occurred. Our operations team saw that the site wasn't responding, took a stack trace, then shut down the instance. I believe they shut down the site before the heap dump finished. The log had no errors/exceptions/evidence of problems--probably because the JVM was killed before it could generate an error message.
Original Question We had a recent situation where the application appeared --to the end user--to hang. We got a stack trace before the application restart and I found some surprising results: of 527 threads, 463 had thread state BLOCKED.
In the Past In the past blocked thread usually had this issue: 1) some obvious bottleneck: e.g. some database record lock or file system lock problem which caused other threads to wait. 2) All blocked threads would block on the same class/method (e.g. the jdbc or file system clases)
Unusual Data In this case, I see all sorts of classes/methods blocked, including jvm internal classes, jboss classes, log4j, etc, in addition to application classes (including jdbc and lucene calls)
The question what would cause a JVM to block log4j.Hierarchy.getLogger, java.lang.reflect.Constructor.newInstance? Obviously some resource "is scarce" but which resource?
thanks
will
Stack Trace Excerpts
http-0.0.0.0-80-417" daemon prio=6 tid=0x000000000f6f1800 nid=0x1a00 waiting for monitor entry [0x000000002dd5d000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at java.lang.Class.newInstance0(Class.java:355)
at java.lang.Class.newInstance(Class.java:308)
at org.jboss.ejb.Container.createBeanClassInstance(Container.java:630)
http-0.0.0.0-80-451" daemon prio=6 tid=0x000000000f184800 nid=0x14d4 waiting for monitor entry [0x000000003843d000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2427)
at java.lang.Class.getMethod0(Class.java:2670)
"http-0.0.0.0-80-449" daemon prio=6 tid=0x000000000f17d000 nid=0x2240 waiting for monitor entry [0x000000002fa5f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.register(Http11Protocol.java:638)
- waiting to lock <0x00000007067515e8> (a org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.createProcessor(Http11Protocol.java:630)
"http-0.0.0.0-80-439" daemon prio=6 tid=0x000000000f701800 nid=0x1ed8 waiting for monitor entry [0x000000002f35b000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.log4j.Hierarchy.getLogger(Hierarchy.java:261)
at org.apache.log4j.Hierarchy.getLogger(Hierarchy.java:242)
at org.apache.log4j.LogManager.getLogger(LogManager.java:198)
These are listed roughly in the order I would try them, depending on the evidence collected:
newInstance()
and a few others above being blocked. Run your VM with -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc
and log the output. Are you seeing excessive GC times near the time of failure/lockup?
OutOfMemoryError
when you should, you can tune the GC tunables... see JDK6.0 XX options, or JDK6.0 GC Tuning Whitepaper. Look specifically at -XX:+UseGCOverheadLimit
and -XX:+GCTimeLimit
and related options. (note these are not well documented, but may be useful...)jconsole
can do this for you ... (yep, under the threads tab, "detect deadlocks")