interpreting jstack output

Alex Averbuch picture Alex Averbuch · Aug 10, 2013 · Viewed 10.1k times · Source

I have a java process loading a lot of data from a bunch of .csv files into a Neo4j database using the BatchInserter. I was using:

  • OpenJDK 7
  • Ubuntu 12.04
  • Neo4j 2.0 M3

After loading the first 164 GB (according to ls -lh) the folder size stopped increasing but the process kept running, no memory was released, and CPU was still at 100% (all according to htop).

The loading process is single threaded, only the JVM is using more than 1 thread - I guess by the ParallelGC.

I'm not sure how to diagnose this type of problem but was instructed to try jstack, so have included its output below.

Anyone has an idea what went wrong or has suggestions on how I can go about diagnosing this?

Full thread dump OpenJDK 64-Bit Server VM (22.0-b10 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007fc3a4001000 nid=0x5636 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" daemon prio=10 tid=0x00007fcf58123000 nid=0x4545 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007fcf58120800 nid=0x4544 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007fcf5811d800 nid=0x4543 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007fcf5811b800 nid=0x4542 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007fcf580c4800 nid=0x4541 in Object.wait() [0x00007fc3f3cfb000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007fc4165fc708> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x00007fc4165fc708> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007fcf580c2000 nid=0x4540 in Object.wait() [0x00007fc3f3dfc000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007fc4165ffe08> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00007fc4165ffe08> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007fcf58007800 nid=0x452c runnable [0x00007fcf606b7000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.String.substring(String.java:1960)
        at java.lang.String.subSequence(String.java:1993)
        at java.util.regex.Pattern.split(Pattern.java:1202)
        at com.ldbc.socialnet.neo4j.CsvFileReader.parseLine(CsvFileReader.java:73)
        at com.ldbc.socialnet.neo4j.CsvFileReader.nextLine(CsvFileReader.java:61)
        at com.ldbc.socialnet.neo4j.CsvFileReader.hasNext(CsvFileReader.java:33)
        at com.ldbc.socialnet.neo4j.CsvFileInserter.bufferLines(CsvFileInserter.java:62)
        at com.ldbc.socialnet.neo4j.CsvFileInserter.insertAllBuffered(CsvFileInserter.java:93)
        at com.ldbc.socialnet.neo4j.LdbcSocialNeworkNeo4jImporter.load(LdbcSocialNeworkNeo4jImporter.java:79)
        at com.ldbc.socialnet.neo4j.LdbcSocialNeworkNeo4jImporter.main(LdbcSocialNeworkNeo4jImporter.java:49)

"VM Thread" prio=10 tid=0x00007fcf580ba000 nid=0x453f runnable 

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007fcf58012800 nid=0x452d runnable 

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007fcf58014800 nid=0x452e runnable 

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007fcf58016000 nid=0x452f runnable 

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007fcf58018000 nid=0x4530 runnable 

"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007fcf5801a000 nid=0x4531 runnable 

"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007fcf5801b800 nid=0x4532 runnable 

"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007fcf5801d800 nid=0x4533 runnable 

"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007fcf5801f800 nid=0x4534 runnable 

"GC task thread#8 (ParallelGC)" prio=10 tid=0x00007fcf58021000 nid=0x4535 runnable 

"GC task thread#9 (ParallelGC)" prio=10 tid=0x00007fcf58023000 nid=0x4536 runnable 

"GC task thread#10 (ParallelGC)" prio=10 tid=0x00007fcf58025000 nid=0x4537 runnable 

"GC task thread#11 (ParallelGC)" prio=10 tid=0x00007fcf58026800 nid=0x4538 runnable 

"GC task thread#12 (ParallelGC)" prio=10 tid=0x00007fcf58028800 nid=0x4539 runnable 

"GC task thread#13 (ParallelGC)" prio=10 tid=0x00007fcf5802a800 nid=0x453a runnable 

"GC task thread#14 (ParallelGC)" prio=10 tid=0x00007fcf5802c800 nid=0x453b runnable 

"GC task thread#15 (ParallelGC)" prio=10 tid=0x00007fcf5802e000 nid=0x453c runnable 

"GC task thread#16 (ParallelGC)" prio=10 tid=0x00007fcf58030000 nid=0x453d runnable 

"GC task thread#17 (ParallelGC)" prio=10 tid=0x00007fcf58032000 nid=0x453e runnable 

"VM Periodic Task Thread" prio=10 tid=0x00007fcf5812d800 nid=0x4546 waiting on condition 

JNI global references: 175

Heap
 PSYoungGen      total 11211840K, used 7045440K [0x00007fcb95000000, 0x00007fcf3d160000, 0x00007fcf55000000)
  eden space 7045440K, 100% used [0x00007fcb95000000,0x00007fcd43050000,0x00007fcd43050000)
  from space 4166400K, 0% used [0x00007fce39510000,0x00007fce39510000,0x00007fcf379d0000)
  to   space 4035328K, 0% used [0x00007fcd43050000,0x00007fcd43050000,0x00007fce39510000)
 PSOldGen        total 31457280K, used 31300002K [0x00007fc415000000, 0x00007fcb95000000, 0x00007fcb95000000)
  object space 31457280K, 99% used [0x00007fc415000000,0x00007fcb8b668b98,0x00007fcb95000000)
 PSPermGen       total 21248K, used 7487K [0x00007fc40aa00000, 0x00007fc40bec0000, 0x00007fc415000000)
  object space 21248K, 35% used [0x00007fc40aa00000,0x00007fc40b14ff48,0x00007fc40bec0000)

In particular, can anyone give a little insight on these heap values or suggest a good read about them?

Heap
 PSYoungGen      total 16348096K, used 12660905K [0x00007f833a560000, 0x00007f87639c0000, 0x00007f8765000000)
  eden space 15282432K, 82% used [0x00007f833a560000,0x00007f863f18a688,0x00007f86df1a0000)
  from space 1065664K, 0% used [0x00007f86df1a0000,0x00007f86df1a0000,0x00007f8720250000)
  to   space 1036288K, 0% used [0x00007f87245c0000,0x00007f87245c0000,0x00007f87639c0000)
 ParOldGen       total 34952576K, used 34903343K [0x00007f7ae5000000, 0x00007f833a560000, 0x00007f833a560000)
  object space 34952576K, 99% used [0x00007f7ae5000000,0x00007f833754bd90,0x00007f833a560000)
 PSPermGen       total 21248K, used 7130K [0x00007f7adfe00000, 0x00007f7ae12c0000, 0x00007f7ae5000000)
  object space 21248K, 33% used [0x00007f7adfe00000,0x00007f7ae04f6860,0x00007f7ae12c0000)

Answer

Jk1 picture Jk1 · Aug 10, 2013

It looks like your process is almost out of memory:

eden space 7045440K, 100% used
object space 31457280K, 99% used

In this case GC may run continiosly, trying to free some memory, consuming alomist all CPU. So application logic thread is under heavy starvation. You may either add more memory via -Xmx JVM option or profile the application. Profiling via JVisualVM, Jprofiler or other tool will give you important runtime data:

  • Memory allocaton dynamics. If memory consumption is increasing all the time you're likely to have a memory leak
  • Heap contents to find out what occupies the momory
  • GC statistics

Based on this data you'll certanly be able to resolve the issue.