Recently I was told about a java.lang.OutOfMemoryError: Java heap space entry found in one of our application logs. Of course, this is one of the most shocking log entries for any developer.

To share how I tracked down and localized the issue, this post summarizes the basics of discovering possible memory leaks with Eclipse Memory Analyzer.

Wasting time

After a quick search of our recent major commits, I couldn’t find any change that would cause a heap problem. Of course not, because we implement a pretty good review process that prevents any problems. So I’m pretty sure our code is always bug-free, as it is in any software development team. Certainly not!
Analyzing several of our monitoring and logging tools wasn’t very revealing either.

Decision making

After some internal meetings, we guessed that the problem must be related to a HashMap, caching, or something similar. We therefore decided that we needed a heap dump to finally localize the cause of the problem.

Enable Heap Dump on Error

I configured our JVM to store a heap dump on the next out-of-memory error by adding the options
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<file-or-dir-path>
and reduced the available heap memory to force an OutOfMemoryError in the foreseeable future.
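
Before waiting for the next crash, it can be worth confirming that the running JVM actually picked up these options. The following is a minimal sketch, assuming a HotSpot-based JVM (the com.sun.management API is not available on every JVM); the class name HeapDumpFlagCheck and the helper vmOption are made up for this illustration and are not part of our application.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads HotSpot VM options of the running JVM.
final class HeapDumpFlagCheck {

    // Returns the current value of a HotSpot VM option as a string.
    // Throws IllegalArgumentException if the option name is unknown.
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // "true" if -XX:+HeapDumpOnOutOfMemoryError is active
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        // Empty if no -XX:HeapDumpPath was given
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
        // Maximum heap the JVM will try to use, in bytes
        System.out.println("Max heap (bytes) = " + Runtime.getRuntime().maxMemory());
    }
}
```

Running this with the same JVM options as the application shows whether the heap dump flag and the reduced heap size are really in effect.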

After some time we got another out of memory error (of course without any downtime because of our application’s load-balanced failsafe deployment) and I was very enthusiastic to find the root cause based on the stored heap dump.

How to Analyze a Heap Dump

To analyze an application heap dump, download the Eclipse Memory Analyzer (MAT) and open the heap dump .hprof file. In the wizard that appears, select Leak Suspects Report to start the analysis.

In our case, the Leak Suspects view of MAT already reported a problem concerning the heap usage of our ehcache. About 60% of our heap was used by our caching implementation.

Eclipse Memory Analyzer (MAT) – Leak Suspects View

I was relieved and shocked at the same moment.
Relieved because the first impression revealed exactly what we had suspected, and shocked because our small business application obviously caches that much data!?

Digging In

How do you find out more details about the heap problem?
Start with Dominator Tree and Histogram!

The Dominator Tree lists the biggest objects in the dump and is the most useful view in MAT because it gives a good overview for finding a memory leak.
Expand each entry to see the objects it references. This way you can discover the biggest memory consumer and also the class or object that causes the memory leak by preventing the garbage collector from reclaiming the space.

Digging into the Dominator Tree, we could already find the application's cache key of the memory-eating cached objects.

Hint: To see the reference chain that keeps a class or object alive, use List objects with incoming / outgoing references, or the Paths to GC Roots.

Besides the Dominator Tree, the Histogram view lists the number of instances and the memory they occupy, grouped by class. This way the classes that eat up your memory can be found easily.
Again, incoming / outgoing references or the Paths to GC Roots help to drill down and reveal what keeps the memory-consuming classes alive.

Same as in the Dominator Tree, ehcache takes the biggest piece of the cake.

Summary

Drilling down to GC Roots and incoming / outgoing references confirmed that the root of our evil was an external API object (each instance about 1MB!) that was cached hundreds of times. So our ehcache consumed a significant part of the memory, which ended up in an OutOfMemoryError.
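
To illustrate why a cache of large objects needs an upper bound, here is a minimal sketch of a size-limited LRU cache built only on the JDK's LinkedHashMap. It is not our actual fix and not ehcache's API; a real fix would more likely set a size limit in the cache configuration itself. The class name BoundedCache and the limit of 100 entries are assumptions chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a cache that evicts the least recently
// used entry once it holds more than maxEntries entries.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {

    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // removes the eldest (least recently used) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<Integer, byte[]> cache = new BoundedCache<>(100);
        for (int i = 0; i < 300; i++) {
            // Each value stands in for a large external API object (here only 1 KB)
            cache.put(i, new byte[1024]);
        }
        System.out.println("Entries kept: " + cache.size());
    }
}
```

With about 1MB per object, an unbounded cache grows by roughly 1MB per distinct key, while a bounded one like this stays below a predictable ceiling.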

Points to Keep in Mind

  1. Keep cool and don’t waste time.
  2. Organize a heap dump of your application.
  3. Import the heap dump into Eclipse Memory Analyzer (MAT).
  4. Use Dominator Tree and Histogram and drill down into the biggest memory consumers.
  5. Locate and fix the issue.
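
For point 2, a heap dump does not have to wait for an OutOfMemoryError. The following is a minimal sketch of writing one on demand from inside the application, assuming a HotSpot-based JVM; the class HeapDumper and its dump method are hypothetical names for this illustration. Note that the target file must not already exist, and recent JDKs require the file name to end in .hprof.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes a heap dump of the running JVM to a file.
final class HeapDumper {

    // Writes the dump to 'target' and returns the same path.
    // liveOnly = true dumps only objects that are still reachable.
    static Path dump(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // The file name is an assumption; pass another one as the first argument
        Path target = Paths.get(args.length > 0 ? args[0] : "manual-dump.hprof");
        System.out.println("Heap dump written to " + dump(target, true).toAbsolutePath());
    }
}
```

The resulting .hprof file can be opened in MAT in the same way as a dump written on an OutOfMemoryError.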
