I am running an application that creates and forgets large amounts of objects, the amount of long existing objects does grow slowly, but this is very little compared to short lived objects. This is a desktop application with high availability requirements, it needs to be turned on 24 hours per day. Most of the work is done on a single thread, this thread will just use all CPU it can get its hands.
In the past we have seen the following under heavy load:
The used heap space slowly goes up as the garbage collector collects less than the amount of memory newly allocated, the used heap size slowly grows and eventually comes near the specified max heap. At that point the garbage collector will kick in heavily and start using a huge amount of resources to prevent going over the max heap size. This slows the application down (easily 10x as slow) and at this point most of times the GC will succeed to clean up the garbage after a few minutes or fail and throw an OutOfMemoryException
, both of them are not really acceptable.
The hardware used is a quad core processor with at least 4GB of memory running 64 bit Linux, all of that we can use if needed. Currently the application is heavily using a single core, which is using most of its time running a single core/thread. The other cores are mostly idle and could be used for garbage collection.
I have a feeling the garbage collector should be collecting more aggressively at an early stage, well before it runs out of memory. Our application does not have any throughput issues, low pause time requirements are a bit more important than throughput, but far less important than not getting near the max heap size. It is acceptable if the single busy thread runs at only 75% of the current speed, as long as it means the garbage collector can keep up with the creation. So in short, a steady decrease of performance is better than the sudden drop we see now.
I have read Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning thoroughly, which means I understand the options well, however I still find it hard to chose the right settings as my requirements are a bit different from what is discussed in the paper.
Currently I am using the ParallelGC with the option -XX:GCTimeRatio=4
. This works a bit better than the default setting for time ratio, but I have a feeling the GC is allowed to run more by that setting than it does.
For monitoring I am using jconsole and jvisualvm mostly.
I would like to know what garbage collection options you recommend for the above situation. Also which GC debug output can I look at to understand the bottle neck better.
EDIT: I understand a very good option here is to create less garbage, this is something we are really considering, however I would like to know how we can tackle this with GC tuning, as that is something we can do much more easily and roll out more quickly than changing large amounts of the source code. Also I have ran the different memory profilers and I understand what the garbage is used by, and there by I know it consists of objects that could be collected.
I am using:
java version "1.6.0_27-ea"
Java(TM) SE Runtime Environment (build 1.6.0_27-ea-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.2-b03, mixed mode)
With JVM parameters:
-Xmx1024M and -XX:GCTimeRatio=4
Edit in reply to Matts comments: Most memory (and cpu) goes towards constructing objects that represent the current situation. Some of these will be discarded right away as the situation changes rapidly, some others will have a medium life time if no updates come in for a while.
You don't mention which build of the JVM you're running, this is crucial info. You also don't mention how long the app tends to run for (e.g. is it for the length of a working day? a week? less?)
A few other points
-XX:+UseCompressedOops
otherwise you're simply wasting space as a 64bit JVM occupies ~1.5x the space of a 32bit JVM for the same workload without it (and if you're not, upgrade to get access to more RAM)You may also want to read another answer I've written on this subject which goes into sizing your survivor spaces & eden appropriately. Basically what you want to achieve is;
Therefore say you had a 6G heap, you might do something like 5G eden + 16M survivor spaces + a tenuring threshold of 1.
The basic process is
Therefore, given spaces appropriately sized for your application's allocation profile, it's perfectly possible to configure the system such that it handles the load nicely. A few caveats to this;
Points 1-3 mean this can take ages to get right. On the other hand you may be able to make it good enough v quickly, it depends how anal you are!
Finally, echoing Peter Lawrey's point, you can save a lot of bother (albeit introducing some other bother) if you are really rigorous about object allocation.