Tuning Heap And Garbage Collector
This page outlines some methods for improving ingest performance in Fedora Commons by tuning the heap and the garbage collector. Full coverage of the topic is not possible here because of the sheer amount of information available. Introductory documentation is available here and here. Please note that this article pertains solely to mass ingest in Fedora Commons. Other use cases may need different tuning strategies.
Introduction
For server-based applications it is very important that the heap is adequately sized. A heap that is too small results in frequent garbage collections or, in the worst case, OutOfMemoryErrors. A heap that is too big prolongs garbage collection times and, taken to extremes, eventually forces the operating system to start swapping pages to disk, which is likely to have a significant negative impact on performance.
The Java VM sets the heap size automatically; this feature is referred to as "Ergonomics". If the VM determines that the machine is a "server class" computer, the heap is sized as follows:
- initial heap size of 1/64 of physical memory, up to 1 GB
- maximum heap size of 1/4 of physical memory, up to 1 GB
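The sizing rule above can be written out as a small calculation. This is only a sketch of the rule as stated on this page, not the VM's actual code; the class name, the method names and the 8 GB example machine are assumptions made for illustration.

```java
// Sketch of the default sizing rule described above; not the VM's actual code.
class ErgonomicsDefaults {
    static final long ONE_GB = 1024L * 1024L * 1024L;

    // Initial heap: 1/64 of physical memory, capped at 1 GB.
    static long defaultInitialHeap(long physicalBytes) {
        return Math.min(physicalBytes / 64, ONE_GB);
    }

    // Maximum heap: 1/4 of physical memory, capped at 1 GB.
    static long defaultMaxHeap(long physicalBytes) {
        return Math.min(physicalBytes / 4, ONE_GB);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long physical = 8L * ONE_GB; // assumed example machine with 8 GB of RAM
        System.out.println("initial heap (MB): " + defaultInitialHeap(physical) / mb);
        System.out.println("maximum heap (MB): " + defaultMaxHeap(physical) / mb);
        // The limit the running VM actually uses, for comparison:
        System.out.println("current max heap (MB): " + Runtime.getRuntime().maxMemory() / mb);
    }
}
```

On the assumed 8 GB machine the rule yields an initial heap of 128 MB and a maximum heap capped at 1 GB, which illustrates why the default can be too small for a server with plenty of memory.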
Detecting a small heap
For many applications the default heap size is too small. Several symptoms suggest that the heap size should be increased, for example:
- Frequent garbage collections
- Generally slow application performance that cannot be attributed solely to other factors such as I/O or CPU.
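One way to check for the first symptom without a graphical tool is to read the collection counters that the VM exposes through the standard java.lang.management API. The following is a minimal sketch; the class name and the helper method are assumptions, and the code has to run inside the VM under observation (for example in a small diagnostic servlet) to be meaningful.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcActivity {
    // Share of the VM's uptime spent in garbage collection, as a value in [0, 1].
    static double gcTimeFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long totalCount = 0;
        long totalMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            long count = Math.max(gc.getCollectionCount(), 0);
            long millis = Math.max(gc.getCollectionTime(), 0);
            System.out.println(gc.getName() + ": " + count + " collections, " + millis + " ms");
            totalCount += count;
            totalMillis += millis;
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println("total collections: " + totalCount);
        System.out.printf("GC share of uptime: %.2f%%%n", 100 * gcTimeFraction(totalMillis, uptime));
    }
}
```

A collection count that grows quickly, or a GC share of uptime that keeps rising during an ingest, points in the same direction as the symptoms listed above.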
The screenshot below is a snapshot of the heap during an ingest, taken with Visual GC. It shows an untuned heap. Many garbage collections are triggered explicitly by Fedora Commons. The class responsible for the frequent collections is fedora.server.storage.DefaultDOManager: before each commit it checks whether at least 30% of the heap space is available and, if not, triggers a full garbage collection. This is undesirable because full garbage collections are very expensive "stop-the-world" operations that consume a lot of execution time, and as a result the heap is used inefficiently.
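The kind of check described above can be sketched as follows. This is not the code of DefaultDOManager; the class, the method names and the way free space is computed are assumptions that only illustrate the pattern of forcing a collection when free heap drops below a threshold.

```java
// Illustrative sketch only; not taken from Fedora Commons.
class FreeHeapCheck {
    // True if less than minFreeFraction of the maximum heap is still available.
    static boolean belowFreeThreshold(long usedBytes, long maxBytes, double minFreeFraction) {
        long freeBytes = maxBytes - usedBytes;
        return freeBytes < minFreeFraction * maxBytes;
    }

    // Requests a full collection when free heap is below the threshold.
    static void collectIfLow(double minFreeFraction) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        if (belowFreeThreshold(used, rt.maxMemory(), minFreeFraction)) {
            // Explicit collection request; ignored when the VM runs with -XX:+DisableExplicitGC.
            System.gc();
        }
    }

    public static void main(String[] args) {
        collectIfLow(0.30); // 30% threshold as described in the text
        System.out.println("check completed");
    }
}
```

With a heap that is too small, a check like this fires before almost every commit, which matches the frequent full collections described above.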
Therefore, as a first step it is necessary to disable explicit garbage collections. This can be done by setting the following Java VM option:
-XX:+DisableExplicitGC
The following image shows the result of disabling explicit garbage collections:
- The survivor spaces are being used
- Tenured space is not permanently full
- There are far fewer full garbage collections
Some problems remain. Major garbage collections still occur and the old gen space fills up rather quickly. Furthermore, the eden space still triggers frequent garbage collections.
This can be significantly improved by simply increasing the heap size.
The next image shows the heap after an increase in size and some minor tuning. The eden space is now bigger and consequently fills up much more slowly before a minor gc is run, forming a triangular pattern. The tenured space (old gen) now rarely triggers a full garbage collection because few objects are copied from the survivor spaces. There is also no more premature promotion to the old gen space.
Finding the optimal heap size
In order to size the heap appropriately, the desired outcome must be clear. The typical heap profile for the ingest consists of many new objects, some survivors and very few old objects (somewhat more extreme than here). Therefore the goals for the ingest are:
- maximum throughput of gc and minimum gc time
- finding a balance between survivor ratio and tenuring threshold: survivors should stay in the survivor spaces long enough that they are likely to die before reaching tenured space, but aging objects should not be kept in the survivor spaces too long
- preventing premature promotion
GC times were measured by adding the following options to the VM:
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:gc.log
Results
The test setup was as follows:
- 48,000 digital objects, external managed content
- Java 1.6.0_04, 64 bit
- Apache Tomcat 5.5
- Postgres 8.3 local (also MPT)
Several test runs were conducted using different heap sizes. Eventually a size somewhere between 756m and 1G was found to produce the best results. Sizes above 1.2G slightly decreased performance. Sizes below 756m resulted in markedly decreased performance.
-Xms1g -Xmx1g -XX:+DisableExplicitGC -XX:SurvivorRatio=10 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=30
Setting the initial heap size and the maximum heap size to the same value had a measurable impact on ingest performance, as this prevents the VM from resizing the heap and forces the server to use all the allocated memory from startup. Not setting them equal resulted in a noticeable performance degradation after some time because the eden size became too small. It appears that the automatic adjustment sometimes sets the eden size too small.
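Whether a running VM was actually started with equal initial and maximum heap sizes can be checked from its input arguments. The sketch below uses the standard RuntimeMXBean; the class and method names are assumptions, and the comparison is purely textual, so -Xms1g and -Xmx1024m would be reported as different even though they denote the same size.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class HeapFlagCheck {
    // Returns the text following the given prefix (e.g. "-Xms"), or null if absent.
    // If the flag appears more than once, the last occurrence is returned.
    static String flagValue(List<String> vmArgs, String prefix) {
        String value = null;
        for (String arg : vmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    // True if both -Xms and -Xmx are present and textually equal.
    static boolean fixedHeap(List<String> vmArgs) {
        String xms = flagValue(vmArgs, "-Xms");
        String xmx = flagValue(vmArgs, "-Xmx");
        return xms != null && xms.equalsIgnoreCase(xmx);
    }

    public static void main(String[] args) {
        List<String> vmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("VM arguments: " + vmArgs);
        System.out.println("initial and maximum heap set equal: " + fixedHeap(vmArgs));
    }
}
```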
The following four additional parameters are not absolutely necessary. They improve performance slightly in short test runs (50,000) but their long-term impact is not yet clear.
- -XX:+DisableExplicitGC is the safety precaution discussed previously.
- -XX:SurvivorRatio=10 sets the size ratio of eden to each survivor space to 10:1; the intent is to give survivors enough room to stay alive in the survivor spaces longer.
- -XX:TargetSurvivorRatio=90 raises the targeted occupancy of a survivor space after a minor collection to 90% (the default is 50%).
- -XX:MaxTenuringThreshold=30 keeps objects from being copied to tenured space too early.
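To make the two ratio parameters more concrete, the following sketch computes the resulting space sizes for a given young generation. It ignores the alignment and adaptive resizing that the VM applies in practice, and the 240 MB young generation, the class name and the method names are assumptions chosen to give round numbers.

```java
// Simplified arithmetic; the VM additionally aligns and may resize these spaces.
class YoungGenLayout {
    // SurvivorRatio is eden / one survivor space, and young = eden + 2 survivors,
    // so one survivor space is young / (ratio + 2).
    static long survivorSize(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    static long edenSize(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorSize(youngBytes, survivorRatio);
    }

    // TargetSurvivorRatio is the desired survivor occupancy in percent.
    static long targetSurvivorOccupancy(long survivorBytes, int targetSurvivorRatio) {
        return survivorBytes * targetSurvivorRatio / 100;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long young = 240 * mb; // assumed young generation size
        long survivor = survivorSize(young, 10);
        System.out.println("eden (MB): " + edenSize(young, 10) / mb);
        System.out.println("each survivor (MB): " + survivor / mb);
        System.out.println("target occupancy (MB): " + targetSurvivorOccupancy(survivor, 90) / mb);
    }
}
```

For the assumed 240 MB young generation this gives 200 MB of eden, two survivor spaces of 20 MB each and a target occupancy of 18 MB per survivor space.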
The following image shows a comparison between an untuned heap and a tuned heap using the test setup described above. The improvement in ingest time is about 9%; however, this should be seen as a qualitative result only, due to the high variance associated with this particular test case. Several test runs were conducted and improvements of about 5% to 10% (for some even more) were seen.
- It is also important to adjust the heap size of the ingesting client accordingly.
- The Java VM needs some "warm up time" before measurements can be considered reliable. This is mainly due to the JIT compiler profiling, inlining, optimizing, compiling etc. (visible in the images).
- When tuning heap and gc, some variance in the outcome is to be expected; results should therefore be read qualitatively rather than quantitatively.
- Other factors, such as using some of the vast array of available configuration switches for the Java VM, seem secondary for this use case. They did not improve results and were actually likely to worsen the situation in many cases.
- For the ingest the throughput garbage collector was used. As it is the default choice for a server class machine, no additional tuning was required. Additional tuning of the throughput collector, even changing the garbage collector, did not improve ingest times.
- Tuning the heap (beyond adjusting its size) and gc is highly dependent on the individual hardware/software combination as well as the underlying use case. In many cases it may therefore be a futile exercise or even hurt performance, because the Java VM is highly self-tuning (especially since Java 6).
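The points about warm-up time and variance suggest comparing configurations over several runs rather than a single one. A minimal sketch of such a comparison is shown below; the class, the method names and the sample timings are made up for illustration and are not measurements from the tests described on this page.

```java
import java.util.Arrays;

class RunStatistics {
    // Copies the measurements that remain after skipping the first warmUpRuns entries.
    static double[] dropWarmUp(double[] seconds, int warmUpRuns) {
        int from = Math.min(Math.max(warmUpRuns, 0), seconds.length);
        return Arrays.copyOfRange(seconds, from, seconds.length);
    }

    static double mean(double[] values) {
        if (values.length == 0) {
            return 0;
        }
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Sample standard deviation (n - 1 in the denominator).
    static double stdDev(double[] values) {
        if (values.length < 2) {
            return 0;
        }
        double m = mean(values);
        double squares = 0;
        for (double v : values) {
            squares += (v - m) * (v - m);
        }
        return Math.sqrt(squares / (values.length - 1));
    }

    public static void main(String[] args) {
        // Hypothetical ingest times in seconds; the first run includes JIT warm-up.
        double[] runs = {412.0, 371.0, 365.0, 380.0, 369.0};
        double[] steady = dropWarmUp(runs, 1);
        System.out.printf("mean: %.1f s, standard deviation: %.1f s%n", mean(steady), stdDev(steady));
    }
}
```

If the difference between two configurations is smaller than the spread within each of them, the comparison says little about which configuration is better.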