Analyzing Java Problems – Tools, Snippets and Workflows

April 30th, 2018 by

When we need to investigate the cause for a dysfunctional Java application we have a plethora of tools available that on the one hand help us in gathering information, artifacts and statistics and on the other hand help us in processing this information and identifying possible problems.

The following list of tools, snippets, workflows and information about specific artifacts could provide a starting point for analyzing such problems and covers topics like heap-dumps, thread-dumps, heap-histograms, heap-regions, garbage-collection-logs, hotspot-compiler/codecache-logs, debugging native-memory, tools for heap-dump-analysis, JVM unified logging and more..

Java Analysis Word Cloud

Java Analysis Word Cloud

Contents

  1. Complete List of Forensic and Analysis Tools
  2. Thread Dumps / Analysis
    1. Create Thread Dumps (jstack, jcmd)
    2. Create Thread Dump by Killing the Application
    3. Analyze Thread Dumps with TDA
    4. Capture CPU Information and Thread-Dumps over Time
  3. Create HPROF Heap Dump
    1. HPROF Agent Profiling
    2. HPROF Features
    3. Dump with jmap
    4. Dump with jcmd
    5. Dump on OutOfMemoryError
  4. Create Heap Histogram
  5. Java Flight Recorder (JFR) Recording Runtime Information
    1. Start java app with commercial features unlocked and flight recorder activated:
    2. Check if JVM has commercial features activated
    3. Unlock commercial features
    4. Record for one minute
  6. jhat server
  7. JVM Memory Statistics
    1. JVM Memory Types
    2. Common Heap related Switches
      1. Heap Region Usage (jstat)
  8. Garbage Collection (GC)
    1. Garbage Collection Roots
    2. Garbage Collector Types
      1. Serial GC
      2. Parallel GC
      3. Concurrent Mark & Sweep GC (CMS)
      4. Garbage First (G1) GC
  9. Garbage Collection Logging
  10. Analyzing Garbage Collection
    1. Understanding the GC Log Format
    2. jtune
    3. GCPlot
    4. Universal GC Log Analyzer
    5. HPjmeter
    6. Garbage Collection Statistics Summary (jstat)
    7. Garbage Collection Behavior Statistics (jstat)
    8. Record GC Stats 20 times each 10 seconds (jstat)
  11. HotSpot / JIT Compiler
    1. Write JIT Compiler Output to Log File
    2. Compiler Flags Explained
    3. Print short Compilation Summary
    4. GUI Inspector (JITWatch)
  12. Code Cache Tuning
    1. Print System Defaults
  13. Native Memory
    1. Tracking Native Memory
    2. Print Native Memory Used by Process (jcmd)
    3. Detecting Native Memory Leaks
  14. Eclipse Memory Analyzer (MAT)
    1. Dominator Tree
    2. Shallow vs Retained Heap
    3. Garbage Collection Roots (gcroots)
    4. Histogram
    5. Leak Suspect Report
    6. Duplicate Classes Report
  15. Hypothesis Driven Workflows
    1. High CPU Usage
    2. Out of Memory Errors / High Memory Usage
  16. JVM Unified Logging
    1. Filtering JVM Logs
      1. Filtering by Tag
      2. Filtering by Log-Level
        1. Filtering by Tag and Log-Level
    2. Logging to STDOUT, STDERR or a File
    3. Configure Logging Information / Decoration
    4. Disable Logging
    5. Default Log Settings
    6. Show Usage Information
  17. Appendix A: Word Cloud
 

Complete List of Forensic and Analysis Tools

I have published a more detailed list including some additional snippets and tools (also non-Java) here: http://app.hascode.com/forensic/

I will update this list more frequently so please feel free to add the link above to your bookmarks if you like.

Thread Dumps / Analysis

A thread dump is a snapshot of the state of all threads of a process where each thread is
presented with his complete stacktrace showing its content.

Thread dumps may help in the diagnosis of our application’s threads and for detecting possible
deadlocks.

Create Thread Dumps (jstack, jcmd)

jstack PID > thread.dump
jcmd PID Thread.print > thread.dump

Create Thread Dump by Killing the Application

Another way to obtain heap dumps is to kill the Java ™ process with a SIGQUIT signal (3):

kill -3 PID

We may also use Ctrl + \ or Ctrl + Break depending on our operating system.

Analyze Thread Dumps with TDA

Thread Dump Analyzer is a GUI tool (swing) to analyze thread dumps.

May be obtained from GitHub.

java -jar tda.jar

Capture CPU Information and Thread-Dumps over Time

We’re creating cpu-usage-logs and thread-dumps each 10 seconds six times.

See Workflow: High CPU Usage for more detailed information how to use this information to detect CPU intensive threads.

APP_PID=8726
for i in $(seq 6); do top -b -H -p $APP_PID -n 1 > app_cpu_usage.`date +%s`.txt; jstack $APP_PID > app_threads.`date +%s`.txt; sleep 10; done
% ls -1
app_cpu_usage.1518772437.txt
app_cpu_usage.1518772448.txt
app_cpu_usage.1518772458.txt
app_cpu_usage.1518772469.txt
app_cpu_usage.1518772479.txt
app_cpu_usage.1518772489.txt
app_threads.1518772438.txt
app_threads.1518772448.txt
app_threads.1518772458.txt
app_threads.1518772469.txt
app_threads.1518772479.txt
app_threads.1518772489.txt

Create HPROF Heap Dump

HPROF Agent Profiling

Oracle ™ Documentation

java -agentlib:hprof Classname

HPROF Features

java -agentlib:hprof=help
 
     HPROF: Heap and CPU Profiling Agent (JVMTI Demonstration Code)
 
hprof usage: java -agentlib:hprof=[help]|[<option>=<value>, ...]
 
Option Name and Value  Description                    Default
---------------------  -----------                    -------
heap=dump|sites|all    heap profiling                 all
cpu=samples|times|old  CPU usage                      off
monitor=y|n            monitor contention             n
format=a|b             text(txt) or binary output     a
file=<file>            write data to file             java.hprof[{.txt}]
net=<host>:<port>      send data over a socket        off
depth=<size>           stack trace depth              4
interval=<ms>          sample interval in ms          10
cutoff=<value>         output cutoff point            0.0001
lineno=y|n             line number in traces?         y
thread=y|n             thread in traces?              n
doe=y|n                dump on exit?                  y
msa=y|n                Solaris micro state accounting n
force=y|n              force output to <file>         y
verbose=y|n            print messages about dumps     y
 
Obsolete Options
----------------
gc_okay=y|n
 
Examples
--------
  - Get sample cpu information every 20 millisec, with a stack depth of 3:
      java -agentlib:hprof=cpu=samples,interval=20,depth=3 classname
  - Get heap usage information based on the allocation sites:
      java -agentlib:hprof=heap=sites classname
 
Notes
-----
  - The option format=b cannot be used with monitor=y.
  - The option format=b cannot be used with cpu=old|times.
  - Use of the -Xrunhprof interface can still be used, e.g.
       java -Xrunhprof:[help]|[<option>=<value>, ...]
    will behave exactly the same as:
       java -agentlib:hprof=[help]|[<option>=<value>, ...]
 
Warnings
--------
  - This is demonstration code for the JVMTI interface and use of BCI,
    it is not an official product or formal part of the JDK.
  - The -Xrunhprof interface will be removed in a future release.
  - The option format=b is considered experimental, this format may change
    in a future release.

Dump with jmap

jmap -dump:format=b,file=/tmp/heap.hprof 2381

Dump with jcmd

jcmd <pid/main class> GC.heap_dump filename=heap.hprof

Dump on OutOfMemoryError

Start the Java application with the following parameter

-XX:+HeapDumpOnOutOfMemoryError

Create Heap Histogram

jcmd <pid/main class> GC.class_histogram > heap_histogram.txt
jmap -histo pid > heap_histogram.txt

Java Flight Recorder (JFR) Recording Runtime Information

More information about using JFR and Java Mission Control (JMC) here:

https://www.hascode.com/2017/10/java-mission-control-jmc-and-flight-recorder-jfr/#Activating_Commercial_Features_and_the_Java_Flight_Recorder

Start java app with commercial features unlocked and flight recorder activated:

-XX:+UnlockCommercialFeatures -XX:+FlightRecorder

Check if JVM has commercial features activated

jcmd PID VM.check_commercial_features

Unlock commercial features

jcmd PID VM.unlock_commercial_features

Record for one minute

jcmd PID JFR.start duration=60s filename=one-minute-live-recording.jfr

jhat server

Using jhat with large heap dumps consumes huge amounts of memory.
Use Eclipse Memory Analyzer (MAT) instead, more features, less memory consumption.
jhat -port 7401 -J-Xmx4G heap.bin

JVM Memory Statistics

JVM Memory Types

source: Joe Kutner

Heap
Class instantiations/objects are stored here

Thread stacks
A thread has its own call stack. Primitive local variables and object references are stored there
along with the call stack (list of method invocations) itself. The stack is cleaned up as stack frames move out of context so there is no GC performed here.

Metaspace
Class definitions of the objects is stored here

Code cache
Stores native code from the JIT compiler

Buffer pools
Allocated buffers outside of the heap (often used by libraries for performance reasons).
Buffer pools can be used to share memory between Java code and native code, or map regions of a file into memory.

OS memory
OS keeps heap and stacks for a Java process independent of the heap and stacks managed by the JVM itself.
A small amount of memory is also consumed for each native library loaded.

Common Heap related Switches

Parameter Description
-Xms Initial heap size for when the JVM starts.
-Xmx Maximum heap size.
-Xmn Size of the Young Generation.
-XX:PermSize Starting size of the Permanent Generation.
-XX:MaxPermSize Maximum size of the Permanent Generation
Heap Region Usage (jstat)
jstat -gcutil ID

Garbage Collection (GC)

Reading Recommendation
A lot of in-depth articles can be found in Plumbr’s Garbage Collection Handbook.

Garbage Collection Roots

Garbage Collection Roots (GCRoots) are the starting point for the Garbage Collector during the Mark phase to detect reachable objects.

GCRoots
  • Local variables
  • Active threads
  • Static fields
  • JNI references

In the Sweep phase the Garbage Collector makes sure that the space used by non-reachable objects is freed.

Afterwards some Garbage Collectors may run a compact phase to optimize the heap structures.

Garbage Collector Types

More detailed information: Plumbr – GC Algorithms: Implementations

Serial GC
  • mark-copy for young-generation, mark-sweep-compact for old generation
  • simplest GC, uses only one thread
  • freezes all application threads when running
  • don’t use it, especially not for multi-threaded environments

Enable this garbage collector by adding the following vm parameter:

-XX:+UseSerialGC
Parallel GC
  • mark-copy for young generation, mark-sweep-compact for old generation
  • young and old collections trigger stop-the-world-events
  • collectors use multiple threads to mark-copy/mark-sweep-compact

The Parallel GC is activated by adding the following vm parameter:

-XX:+UseParallelGC
Concurrent Mark & Sweep GC (CMS)

The Concurrent Mark Sweeep GC is activated by adding the following vm parameter:

  • parallel stop-the-world mark-copy for young-generation, mostly-concurrent mark-sweep for old generation
  • designed to avoid long pauses when collecting the old generation, it does not compact old gen but
    uses free lists to manage reclaimed space
-XX:+UseConcMarkSweepGC
Garbage First (G1) GC
  • aims to make duration and distribution of stop-the-world-pauses predictable and configurable
  • heap is split into a number of smaller heap regions where each region may be an Eden, Survivor or
    Old region
  • all Eden+Survivor regions = Young Generation, all Old regions = Old Generation

Activate this Garbage Collector by adding the following vm parameter:

-XX:+UseG1GC

Garbage Collection Logging

By adding the following JVM parameters:

-verbose:gc or the -Xloggc

The following additional parameters add helpful timestamps and more detailed information..

-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps

In addition we may configure how to write the gc log files .. e.g.:

 -Xloggc:/path/to/gc-stats.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M -XX:-PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:-PrintTenuringDistribution

Analyzing Garbage Collection

Analyzing the garbage collection (GC) logs can give us some details e.g.:

  • When did full GCs run?
  • When did scavenge or young generation GCs run?
  • How many scavenge and gulls GCs ran (in which interval)?
  • After GC, how much memory was reclaimed in Young and Old generations?
  • How long did GC run?
  • What was the total allocated memory in each generation?

Understanding the GC Log Format

Allocation Failure

Log entries stating a “GC (Allocation Failure)” do not necessarily identify an error but state that no more space is left in
Eden to allocate objects and therefore it’s a cause of a normal GC Cycle to kick.

Older JMVs did not log the cause of minor GC cycles so that the log files may differ between JVM versions.

Sample Garbage Collector Log-File

Java HotSpot(TM) 64-Bit Server VM (25.151-b12) for linux-amd64 JRE (1.8.0_151-b12), built on Sep  5 2017 19:20:58 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32598036k(20587788k free), swap 33205756k(33205756k free)
CommandLine flags: -XX:InitialHeapSize=521568576 -XX:MaxHeapSize=8345097216 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPo
inters -XX:+UseCompressedOops -XX:+UseParallelGC
2018-03-21T10:39:18.793+0100: 650,456: [GC (Allocation Failure) [PSYoungGen: 128000K->7681K(148992K)] 128000K->7697K(488960K), 0,0073382 secs] [Times: user=0,02 sys=0,00, real=0,01
 secs]
2018-03-21T10:39:20.036+0100: 651,699: [GC (Allocation Failure) [PSYoungGen: 135681K->3936K(148992K)] 135697K->3960K(488960K), 0,0055688 secs] [Times: user=0,03 sys=0,00, real=0,01
 secs]
2018-03-21T10:39:21.021+0100: 652,684: [GC (Allocation Failure) [PSYoungGen: 131936K->4000K(148992K)] 131960K->4032K(488960K), 0,0043573 secs] [Times: user=0,02 sys=0,00, real=0,00
 secs]
2018-03-21T10:39:21.979+0100: 653,642: [GC (Allocation Failure) [PSYoungGen: 132000K->4000K(148992K)] 132032K->4040K(488960K), 0,0045307 secs] [Times: user=0,02 sys=0,00, real=0,01
 secs]
2018-03-21T10:39:22.958+0100: 654,621: [GC (Allocation Failure) [PSYoungGen: 132000K->4064K(129024K)] 132040K->4112K(468992K), 0,0046371 secs] [Times: user=0,02 sys=0,00, real=0,01
 secs]
2018-03-21T10:39:23.874+0100: 655,537: [GC (Allocation Failure) [PSYoungGen: 128992K->4032K(125952K)] 129040K->4088K(465920K), 0,0053952 secs] [Times: user=0,01 sys=0,00, real=0,00
 secs]
 [..]

Inspecting a single line, we may identify the different parts of the log file.

2018-03-21T10:39:18.793+0100: 650,456: [GC (Allocation Failure) [PSYoungGen: 128000K->7681K(148992K)] 128000K->7697K(488960K), 0,0073382 secs] [Times: user=0,02 sys=0,00, real=0,01
 secs]
  • 2018-03-21T10:39:18.793+0100: The GC event’s start time.
  • 650,456: The GC event’s start time relative to the JVM startup time.
  • GC Distinguishing flag between minor and full GC. In this case, it’s a minor GC (otherwise it would have been Full GC)
  • Allocation Failure: The cause of the collection. In this case, the young gen has no more space to allocate objects.
  • PSYoungGen: The target of the GC, the Young Generation..
  • 128000K→7681K: The space reduction of the Young Space after the GC.
  • (148992K): The Young Generation’s total size.
  • 128000K→7697K: Total heap before and after GC
  • (488960K): Total available heap
  • 0,0073382 secs: Duration of GC event in seconds
  • [Times: user=0,02 sys=0,00, real=0,01secs]: The GC event’s duration, measured as:
    • user: Amount of CPU time spent in user-mode code (outside the kernel)
    • sys: Time spent in OS/Kernel calls or waiting for system event
    • real: Wall clock time – time from start to finish (all elapsed time including time slices used by other processes and time spent blocked e.g. waiting for IO)

More details (Nikita Salnikov-Tarnovski)

jtune

A Python based CLI tool to analyze GC log files or GCs at runtime.

jtune on GitHub

pip install jtune
 
jtune -c 40 -p 25815

GCPlot

Is basically a web application that may be run in the local environment e.g. using Docker.

GCPlot Website

docker run -d -p 80:80 gcplot/gcplot

Afterwards we may access GCPlot as a web application running on http://localhost.

Username and password is admin.

Universal GC Log Analyzer

An online tool that allows to analyze uploaded garbage collection logs.

HPjmeter

A feature-rich analyzing tool from Hewlett Packard.

You need to have/create a HPE Passport account to download this tool.

Garbage Collection Statistics Summary (jstat)

jstat -gcutil PID

Garbage Collection Behavior Statistics (jstat)

jstat -gc ID

Record GC Stats 20 times each 10 seconds (jstat)

jstat -gcutil -t PID 10s 20

HotSpot / JIT Compiler

Write JIT Compiler Output to Log File

java -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation -XX:LogFile=jit.log com.hascode.MyClass

Compiler Flags Explained

b    Blocking compiler (always set for client)
*    Generating a native wrapper
%    On stack replacement (where the compiled code is running)
!    Method has exception handlers
s    Method declared as synchronized
n    Method declared as native
made non entrant    compilation was wrong/incomplete, no future callers will use this version
made zombie         code is not in use and ready for GC

Print short Compilation Summary

jstat -printcompilation pid

GUI Inspector (JITWatch)

Allows to inspect JIT compilations and has a sandbox feature to test code snippets for JIT optimization.

  • Download the source from GitHub
  • Run
    mvn clean compile test exec:java

Code Cache Tuning

Possible problem might be indicated by error log entries (e.g. catalina.out) stating that the ‘CodeCache is full’ and
the ‘compiler has been disabled’.

Example:

Java HotSpot(TM) 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
CodeCache: size=245760Kb used=243892Kb max_used=243931Kb free=1867Kb
 bounds [0x00007faa3b170000, 0x00007faa4a170000, 0x00007faa4a170000]
 total_blobs=45719 nmethods=44096 adapters=1528
 compilation: disabled (not enough contiguous free space left)
Solution is to assign more space for the code cache and to activate flushing. This may be done
using -XX:ReservedCodeCacheSize and -XX:+UseCodeCacheFlushing.

JVM Parameters:

-XX:ReservedCodeCacheSize=384m -XX:+UseCodeCacheFlushing

Print System Defaults

To find the default size of the code cache and if flushing is enabled.

% java -XX:+PrintFlagsFinal | grep CodeCache
    uintx CodeCacheExpansionSize                    = 65536                               {pd product}
    uintx CodeCacheMinimumFreeSpace                 = 512000                              {product}
    uintx InitialCodeCacheSize                      = 2555904                             {pd product}
     bool PrintCodeCache                            = false                               {product}
     bool PrintCodeCacheOnCompilation               = false                               {product}
    uintx ReservedCodeCacheSize                     = 251658240                           {pd product}
     bool UseCodeCacheFlushing                      = true                                {product}

Native Memory

Tracking Native Memory

Activate the JVM native memory tracking feature:

-XX:NativeMemoryTracking=detail -XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics

Print Native Memory Used by Process (jcmd)

jcmd <PID> VM.native_memory summary
14321:
 
Native Memory Tracking:
 
Total: reserved=9861214KB, committed=902706KB
-                 Java Heap (reserved=8151040KB, committed=509952KB)
                            (mmap: reserved=8151040KB, committed=509952KB)
 
-                     Class (reserved=1070349KB, committed=23565KB)
                            (classes #2382)
                            (malloc=9485KB #1932)
                            (mmap: reserved=1060864KB, committed=14080KB)
 
-                    Thread (reserved=49556KB, committed=49556KB)
                            (thread #49)
                            (stack: reserved=49344KB, committed=49344KB)
                            (malloc=155KB #250)
                            (arena=56KB #96)
 
-                      Code (reserved=250177KB, committed=4413KB)
                            (malloc=577KB #1505)
                            (mmap: reserved=249600KB, committed=3836KB)
 
-                        GC (reserved=308187KB, committed=283315KB)
                            (malloc=10383KB #135)
                            (mmap: reserved=297804KB, committed=272932KB)
 
-                  Compiler (reserved=148KB, committed=148KB)
                            (malloc=18KB #78)
                            (arena=131KB #3)
 
-                  Internal (reserved=27045KB, committed=27045KB)
                            (malloc=27013KB #4122)
                            (mmap: reserved=32KB, committed=32KB)
 
-                    Symbol (reserved=3987KB, committed=3987KB)
                            (malloc=2764KB #16693)
                            (arena=1223KB #1)
 
-    Native Memory Tracking (reserved=529KB, committed=529KB)
                            (malloc=112KB #1769)
                            (tracking overhead=416KB)
 
-               Arena Chunk (reserved=196KB, committed=196KB)
                            (malloc=196KB)
jcmd does not capture direct or mapped buffer pools.

Detecting Native Memory Leaks

  • Start the JVM with summary or detail tracking enabled:
    -XX:NativeMemoryTracking=summary

    or

    -XX:NativeMemoryTracking=detail
  • Establish a baseline used for later for comparison
    jcmd <pid> VM.native_memory baseline
  • Monitor memory changes with
jcmd <PID> VM.native_memory detail.diff
<PID>:
 
Native Memory Tracking:
 
Total: reserved=9863326KB +18KB, committed=903286KB +82KB
 
-                 Java Heap (reserved=8151040KB, committed=509952KB)
                            (mmap: reserved=8151040KB, committed=509952KB)
 
-                     Class (reserved=1072398KB, committed=23822KB)
                            (classes #2383 +1)
                            (malloc=9486KB #1954 -19)
                            (mmap: reserved=1062912KB, committed=14336KB)
 
-                    Thread (reserved=49556KB, committed=49556KB)
                            (thread #49)
                            (stack: reserved=49344KB, committed=49344KB)
                            (malloc=155KB #250)
                            (arena=56KB #96)
 
-                      Code (reserved=250201KB -8KB, committed=4697KB +56KB)
                            (malloc=601KB -8KB #1523 -16)
                            (mmap: reserved=249600KB, committed=4096KB +64KB)
 
-                        GC (reserved=308187KB, committed=283315KB)
                            (malloc=10383KB #135)
                            (mmap: reserved=297804KB, committed=272932KB)
 
-                  Compiler (reserved=148KB, committed=148KB)
                            (malloc=17KB #75)
                            (arena=131KB #3)
 
-                  Internal (reserved=27047KB +1KB, committed=27047KB +1KB)
                            (malloc=27015KB +1KB #4127 +4)
                            (mmap: reserved=32KB, committed=32KB)
 
-                    Symbol (reserved=3995KB, committed=3995KB)
                            (malloc=2772KB #16694)
                            (arena=1223KB #1)
 
-    Native Memory Tracking (reserved=558KB +26KB, committed=558KB +26KB)
                            (malloc=136KB +21KB #2107 +309)
                            (tracking overhead=422KB +4KB)
 
-               Arena Chunk (reserved=197KB, committed=197KB)
                            (malloc=197KB)
 
[0x00007f97eb193f05] CodeBlob::set_oop_maps(OopMapSet*)+0xe5
[0x00007f97eb19444b] CodeBlob::CodeBlob(char const*, CodeBuffer*, int, int, int, int, OopMapSet*)+0xcb
[0x00007f97eb6069ee] nmethod::nmethod(Method*, int, int, int, CodeOffsets*, int, DebugInformationRecorder*, Dependencies*, CodeBuffer*, int, OopMapSet*, ExceptionHandlerTable*, ImplicitExceptionTable*, AbstractCompiler*, int)+0x4e
[0x00007f97eb607389] nmethod::new_nmethod(methodHandle, int, int, CodeOffsets*, int, DebugInformationRecorder*, Dependencies*, CodeBuffer*, int, OopMapSet*, ExceptionHandlerTable*, ImplicitExceptionTable*, AbstractCompiler*, int)+0x289
                             (malloc=551KB -8KB #1004 -16)
 
[0x00007f97eb580a50] MallocSiteTable::new_entry(NativeCallStack const&)+0x0
[0x00007f97eb580e20] MallocSiteTable::lookup_or_add(NativeCallStack const&, unsigned long*, unsigned long*)+0x0
[0x00007f97eb581080] MallocSiteTable::allocation_at(NativeCallStack const&, unsigned long, unsigned long*, unsigned long*)+0x0
                             (malloc=109KB +3KB #1749 +42)
 
[0x00007f97eb59ca53] MemBaseline::aggregate_virtual_memory_allocation_sites()+0x173
[0x00007f97eb59ccd8] MemBaseline::baseline_allocation_sites()+0x1d8
[0x00007f97eb59d355] MemBaseline::baseline(bool)+0x635
[0x00007f97eb60cd98] NMTDCmd::execute(DCmdSource, Thread*)+0x178
                             (malloc=1KB +1KB #17 +17)
 
[..]

Eclipse Memory Analyzer (MAT)

Reads in HPROF heap dumps and stores them in an index. In contrast to other tools this fact avoids the allocation of (n * heap-dump-size)
memory in the RAM which is significant when trying to process large heap dumps.

MAT Project/Download Site
https://www.eclipse.org/mat/

Dominator Tree

Dominator
An object x dominates an object y if every path in the object graph from the start/root node to y must pass through x.

Source: Eclipse Help

Memory Analyzer provides a dominator tree of the object graph. The transformation of the object
reference graph into a dominator tree allows you to easily identify the biggest chunks of retained memory and the keep-alive dependencies among objects. Bellow is an informal definition of the terms.

Shallow vs Retained Heap

Source: Eclipse Help

Shallow heap is the memory consumed by one object. An object needs 32 or 64 bits (depending on the OS architecture) per reference, 4 bytes per Integer, 8 bytes per Long, etc. Depending on the heap dump format the size may be adjusted (e.g. aligned to 8, etc…​) to model better the real consumption of the VM.

Retained set of X is the set of objects which would be removed by GC when X is garbage collected.

Retained heap of X is the sum of shallow sizes of all objects in the retained set of X, i.e. memory kept alive by X.

Generally speaking, shallow heap of an object is its size in the heap and retained size of the same object is the amount of heap memory that will be freed when the object is garbage collected.

Garbage Collection Roots (gcroots)

Gcroots are objects accessible from outside the heap – these may be ..

Source: Eclipse Help

System Class
Class loaded by bootstrap/system class loader. For example, everything from the rt.jar like java.util.* .

JNI Local
Local variable in native code, such as user defined JNI code or JVM internal code.

JNI Global
Global variable in native code, such as user defined JNI code or JVM internal code.

Thread Block
Object referred to from a currently active thread block.

Thread
A started, but not stopped, thread.

Busy Monitor
Everything that has called wait() or notify() or that is synchronized. For example, by calling synchronized(Object) or by entering a synchronized method. Static method means class, non-static method means object.

Java Local
Local variable. For example, input parameters or locally created objects of methods that are still in the stack of a thread.

Native Stack
In or out parameters in native code, such as user defined JNI code or JVM internal code. This is often the case as many methods have native parts and the objects handled as method parameters become GC roots. For example, parameters used for file/network I/O methods or reflection.

Finalizable
An object which is in a queue awaiting its finalizer to be run.

Unfinalized
An object which has a finalize method, but has not been finalized and is not yet on the finalizer queue.

Unreachable
An object which is unreachable from any other root, but has been marked as a root by MAT to retain objects which otherwise would not be included in the analysis.

Java Stack Frame
A Java stack frame, holding local variables. Only generated when the dump is parsed with the preference set to treat Java stack frames as objects.

Unknown
An object of unknown root type. Some dumps, such as IBM Portable Heap Dump files, do not have root information. For these dumps the MAT parser marks objects which are have no inbound references or are unreachable from any other root as roots of this type. This ensures that MAT retains all the objects in the dump.

Histogram

The histogram shows the number of instances of a particular class and its memory consumption. We may group the view’s entries by classloader
or package (and filter by package) to gain more insights.

When observing the histogram, native structures like String, Object[] and char are seldom the problem’s indicator but large collections or
custom classes may be.

Leak Suspect Report

The leak suspect report may help identifying possible memory leaks as it scans the heap dump for large memory chunks and checks
what keeps these chunks alive.

It also generates some nice pie charts ;)

Duplicate Classes Report

Error may occur if an application loads different version of the same library.

The Duplicate Classes Report searches for classes existing multiple times in the heap and lists them with their corresponding classloader.

Hypothesis Driven Workflows

In case of a resource exhaustion on the affected system, we need to track down what is affected (RAM, disk-space, IO-Handle-Limits, etc..)
and then to investigate how our Java application causes or affects this situation.

High CPU Usage

The following workflow allows us to identify the Java threads causing high CPU usage.

  • First of all we need to know which Java process has high CPU time:The following command lists processes, sorts them by their CPU time and greps for lines containing java:
    ps -eLf --sort -pcpu|grep java
  • Now we’ve got the pid of the CPU using process, let’s say it is 123456.We now want to know which thread of this process uses many CPU resources.
  • One way to look at a process’ threads is using ps like this:
    ps -Lp 123456

    Using top is an option, too:

    top -H -p 123456
  • Now we should have one or more pids of the suspicious threads.
  • If we haven’t take a thread dump before, we should do this now using jstack:
    jstack PID > thread.dump

    A better alternative is using jcmd:

    jcmd PID Thread.print > thread.dump

    To record some thread-dumps and cpu-logs over time, I have added a little bash script here: Capture CPU Information and Thread-Dumps over Time.

  • Now we need to convert the suspicious thread’s PID to the hex-format so that we can search
    for these threads in the generated thread dumps.

    The following commands converts the PID to the hex-format.

    % printf 0x%x 8726
    0x2216

    With this hex-id we can now search in the thread-dumps and identify the exact threads.

Out of Memory Errors / High Memory Usage

  • First of all we need to create a HPROF heap dump. For more information how to create these dumps, see “Create HPROF Heap Dump”.

There are different tools to read in the generated heap dump, but many of these tools require the 1.5x size of the dump of memory.
This may be a problem for larger heap dumps that exceed your workstation’s RAM.

Common open-source tools for heap dump analysis are:

  • visualvm
  • jmc(Java Mission Control)
  • Eclipse Memory Analyzer (MAT)

The Eclipse Memory Analyzer (MAT) has the advantage of reading in the heap information in an internal index
and therefore does not consume as much RAM as other tools and may be used for large heap dumps.

Please refer to the chapter “Eclipse Memory Analyzer (MAT)” for more detailed information about its capabilities and reports.

JVM Unified Logging

With Java 9, JEP-158 added unified logging for the Java ™ Virtual Machine.

By adding a vm parameter this allows us to specify ..

  • what to log: messages may be selected by tag or log-level
  • the log information: the output may be configured to include information like process-ids, timestamps
  • where to log: writing the output STDOUT, STDERR or a file

Unified Logging is activated by adding the following JVM parameter:

-Xlog

More detailed information in general can be found in Oracle’s Java 9 documentation.

Filtering JVM Logs

The following shortened output gives a good example for the different event labels and log-levels.

$ java -Xlog -version
[0.006s][info][os] SafePoint Polling address: 0x00007f3f9bcf1000
[0.006s][info][os] Memory Serialize Page address: 0x00007f3f9bcf0000
[0.007s][info][os,thread] Capturing initial stack in user thread: req. size: 1024K, actual size: 1024K, top=0x00007ffd552fc000, bottom=0x00007ffd551fc000
[0.007s][info][os       ] HotSpot is running with glibc 2.23, NPTL 2.23
[0.007s][info][biasedlocking] Aligned thread 0x00007f3f94013550 to 0x00007f3f94013800
[0.007s][info][os,thread    ] Thread attached (tid: 2571, pthread id: 139911173547776).
[0.008s][info][class,path   ] bootstrap loader class path=/data/app/jdk-9/lib/modules
[0.008s][info][class,path   ] classpath:
[0.008s][info][class,path   ] opened: /data/app/jdk-9/lib/modules
[0.008s][info][class,load   ] opened: /data/app/jdk-9/lib/modules
[0.008s][info][pagesize     ] CodeHeap 'non-nmethods':  min=2496K max=5700K base=0x00007f3f7ba35000 page_size=4K size=5700K
[0.008s][info][pagesize     ] CodeHeap 'profiled nmethods':  min=2496K max=120028K base=0x00007f3f7bfc6000 page_size=4K size=120028K
[0.008s][info][pagesize     ] CodeHeap 'non-profiled nmethods':  min=2496K max=120032K base=0x00007f3f834fd000 page_size=4K size=120032K
[0.008s][info][os,cpu       ] CPU:total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 58 stepping 9, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, aes, clmul, erms, ht, tsc, tscinvbit, tscinv
[0.025s][info][os,cpu       ] CPU Model and flags from /proc/cpuinfo:
[0.025s][info][os,cpu       ] model name        : Intel(R) Core(TM) i7-3820QM CPU @ 2.70GHz
[0.026s][info][os,cpu       ] flags             : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti retpoline spec_ctrl tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
[0.026s][info][startuptime  ] StubRoutines generation 1, 0.0002880 secs
[0.026s][info][gc,heap      ] Heap region size: 2M
[..]
[0.055s][info][biasedlocking] Aligned thread 0x00007f3f942492f0 to 0x00007f3f94249800
[0.055s][info][os,thread    ] Thread started (pthread id: 139908884715264, attributes: stacksize: 1024k, guardsize: 0k, detached).
[0.055s][info][os,thread    ] Thread is alive (tid: 2595, pthread id: 139908884715264).
[..]
[0,102s][info][module,load  ] java.xml location: jrt:/java.xml
[..]
[0,106s][info][class,load   ] java.util.concurrent.CopyOnWriteArrayList source: jrt:/java.base
[0,107s][info][class,init   ] 360 Initializing 'java/util/concurrent/CopyOnWriteArrayList'(no method) (0x00000007c004fa70)
[..]
[0,110s][info][safepoint,cleanup ] purging class loader data graph, 0,0000001 secs
[0,110s][info][os,thread         ] Thread finished (tid: 2593, pthread id: 139908886820608).
Filtering by Tag

Using the example above we now would like to filter by os tagged jvm messages.

Available tags are
  • add
  • age
  • alloc
  • annotation
  • aot
  • arguments
  • attach
  • barrier
  • biasedlocking
  • blocks
  • bot
  • breakpoint
  • bytecode
  • census
  • class
  • classhisto
  • cleanup
  • compaction
  • comparator
  • constraints
  • constantpool
  • coops
  • cpu
  • cset
  • data
  • defaultmethods
  • dump
  • ergo
  • event
  • exceptions
  • exit
  • fingerprint
  • freelist
  • gc
  • hashtables
  • heap
  • humongous
  • ihop
  • iklass
  • init
  • itables
  • jfr
  • jni
  • jvmti
  • liveness
  • load
  • loader
  • logging
  • mark
  • marking
  • metadata
  • metaspace
  • method
  • mmu
  • modules
  • monitorinflation
  • monitormismatch
  • nmethod
  • normalize
  • objecttagging
  • obsolete
  • oopmap
  • os
  • pagesize
  • parser
  • patch
  • path
  • phases
  • plab
  • preorder
  • promotion
  • protectiondomain
  • purge
  • redefine
  • ref
  • refine
  • region
  • remset
  • resolve
  • safepoint
  • scavenge
  • scrub
  • setting
  • stackmap
  • stacktrace
  • stackwalk
  • start
  • startuptime
  • state
  • stats
  • stringdedup
  • stringtable
  • subclass
  • survivor
  • sweep
  • system
  • task
  • thread
  • time
  • timer
  • tlab
  • unload
  • update
  • verification
  • verify
  • vmoperation
  • vtables
  • workgang

We may specify one or more tags by adding them to the JVM parameter like this:

% java -Xlog:os -version
[0.007s][info][os] SafePoint Polling address: 0x00007fe6f1d9d000
[0.007s][info][os] Memory Serialize Page address: 0x00007fe6f1d9c000
[0.007s][info][os] HotSpot is running with glibc 2.23, NPTL 2.23
openjdk version "9"
OpenJDK Runtime Environment (build 9+181)
OpenJDK 64-Bit Server VM (build 9+181, mixed mode)

Another example using multiple tags, os and cpu:

% java -Xlog:os+cpu -version
[0.009s][info][os,cpu] CPU:total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 58 stepping 9, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, aes, clmul, erms, ht, tsc, tscinvbit, tscinv
[0.026s][info][os,cpu] CPU Model and flags from /proc/cpuinfo:
[0.026s][info][os,cpu] model name       : Intel(R) Core(TM) i7-3820QM CPU @ 2.70GHz
[0.027s][info][os,cpu] flags            : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti retpoline spec_ctrl tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
openjdk version "9"
OpenJDK Runtime Environment (build 9+181)
OpenJDK 64-Bit Server VM (build 9+181, mixed mode)
The tags specified as a filter must exactly match the tags assigned to the message!

Therefore -Xlog:os might result in a different output than -Xlog:os+cpu.

To show every message tagged with “os“, we may use the following wildcard expression: -Xlog:os*

Several selectors may be applied, separated by a comma:

% java -Xlog:os,os+cpu -version
[0.008s][info][os] SafePoint Polling address: 0x00007f823b8e6000
[0.008s][info][os] Memory Serialize Page address: 0x00007f823b8e5000
[0.008s][info][os] HotSpot is running with glibc 2.23, NPTL 2.23
[0.011s][info][os,cpu] CPU:total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 58 stepping 9, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, aes, clmul, erms, ht, tsc, tscinvbit, tscinv
[0.030s][info][os,cpu] CPU Model and flags from /proc/cpuinfo:
[0.030s][info][os,cpu] model name       : Intel(R) Core(TM) i7-3820QM CPU @ 2.70GHz
[0.030s][info][os,cpu] flags            : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti retpoline spec_ctrl tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
openjdk version "9"
OpenJDK Runtime Environment (build 9+181)
OpenJDK 64-Bit Server VM (build 9+181, mixed mode)
Filtering by Log-Level

We may specify the designated log-level for filtering messages.

Available log-levels are
  • info
  • debug
  • off
  • trace
  • warning
  • error

The first example only displays messages that are warnings – luckily there are no warnings ;)

% java -Xlog:all=warning --version
openjdk 9
OpenJDK Runtime Environment (build 9+181)
OpenJDK 64-Bit Server VM (build 9+181, mixed mode)

The following example displays all debug messages:

% java -Xlog:all=debug
[0.001s][debug][os] Initial active processor count set to 8
[0.006s][info ][os] SafePoint Polling address: 0x00007fe6713f0000
[0.006s][info ][os] Memory Serialize Page address: 0x00007fe6713ef000
[0.007s][info ][os,thread] Capturing initial stack in user thread: req. size: 1024K, actual size: 1024K, top=0x00007fff13978000, bottom=0x00007fff13878000
[0.007s][info ][os       ] HotSpot is running with glibc 2.23, NPTL 2.23
[..]
Filtering by Tag and Log-Level

We may specify the designated log-level for specific tagged messages.

In the following example, we’re showing only messages tagged with os and having a log-level debug:

% java -Xlog:os=debug --version                                                                                                                                                 1[0.001s][debug][os] Initial active processor count set to 8
[0.007s][info ][os] SafePoint Polling address: 0x00007fe79010b000
[0.007s][info ][os] Memory Serialize Page address: 0x00007fe79010a000
[0.007s][info ][os] HotSpot is running with glibc 2.23, NPTL 2.23
openjdk 9
OpenJDK Runtime Environment (build 9+181)
OpenJDK 64-Bit Server VM (build 9+181, mixed mode)

Logging to STDOUT, STDERR or a File

We may specify where to log by appending one of the possible three locations to the parameter list preceded by colon:

  • stdout: wrinting to STDOUT
  • stderr: writing to STDERR
  • file=/path/to/file: writing to the file /path/to/file

The following example logs all warnings to STDERR:

% java -Xlog:all=warning:stderr --version

In this example, all os tagged debug messages are written to a log file /tmp/jvm-debug.log:

% java -Xlog:os=debug:file=/tmp/jvm-debug.log --version
openjdk 9
OpenJDK Runtime Environment (build 9+181)
OpenJDK 64-Bit Server VM (build 9+181, mixed mode)

Afterwards we should be able to read the log-file created:

% cat /tmp/jvm-debug.log
[0.001s][debug][os] Initial active processor count set to 8
[0.006s][info ][os] SafePoint Polling address: 0x00007f93e1460000
[0.006s][info ][os] Memory Serialize Page address: 0x00007f93e145f000
[0.006s][info ][os] HotSpot is running with glibc 2.23, NPTL 2.23

Configure Logging Information / Decoration

We may specify how to decorate our log messages. Therefore one ore more of the following flags may be applied to add useful information to our logs:

Source: Oracle.com – Java Platform, Standard Edition Tools Reference

Flag Description
time or t Current time and date in ISO-8601 format.
utctime or utc Universal Time Coordinated or Coordinated Universal Time.
uptime or Time since the start of the JVM in seconds and milliseconds. For example, 6.567s.
timemillis or tm The same value as generated by System.currentTimeMillis().
uptimemillis or um Milliseconds since the JVM started.
timenanos or tn The same value generated by System.nanoTime().
uptimenanos or un Nanoseconds since the JVM started.
hostname or hn The host name.
pid or The process identifier.
tid or ti The thread identifier.
level or The level associated with the log message.
tags or tg The tag-set associated with the log message.

In the following example, we want to display the time and date, the process-id, the thread-id,
the log-level and the tags for each log message that is tagged with os and has log-level debug:

% java -Xlog:os=debug:stdout:time,pid,tid,level,tags --version
[2018-04-27T10:15:10.874+0200][16423][16424][debug][os] Initial active processor count set to 8
[2018-04-27T10:15:10.879+0200][16423][16424][info ][os] SafePoint Polling address: 0x00007f0379b60000
[2018-04-27T10:15:10.879+0200][16423][16424][info ][os] Memory Serialize Page address: 0x00007f0379b5f000
[2018-04-27T10:15:10.879+0200][16423][16424][info ][os] HotSpot is running with glibc 2.23, NPTL 2.23
openjdk 9
OpenJDK Runtime Environment (build 9+181)
OpenJDK 64-Bit Server VM (build 9+181, mixed mode)

Disable Logging

We may disable logging and reset filters etc with -Xlog:disable:

Example:

% java -Xlog:disable --version
openjdk 9
OpenJDK Runtime Environment (build 9+181)
OpenJDK 64-Bit Server VM (build 9+181, mixed mode)

Default Log Settings

When -Xlog is used without further details it is equivalent to calling -Xlog:all=warning:stdout:uptime,level,tags

Show Usage Information

More information about the syntax and additional examples can be obtained by running the following
command: java -Xlog:help

% java -Xlog:help
-Xlog Usage: -Xlog[:[what][:[output][:[decorators][:output-options]]]]
         where 'what' is a combination of tags and levels of the form tag1[+tag2...][*][=level][,...]
         Unless wildcard (*) is specified, only log messages tagged with exactly the tags specified will be matched.
 
Available log levels:
 off, trace, debug, info, warning, error
 
Available log decorators:
 time (t), utctime (utc), uptime (u), timemillis (tm), uptimemillis (um), timenanos (tn), uptimenanos (un), hostname (hn), pid (p), tid (ti), level (l), tags (tg)
 Decorators can also be specified as 'none' for no decoration.
 
Available log tags:
 add, age, alloc, aot, annotation, arguments, attach, barrier, biasedlocking, blocks, bot, breakpoint, census, class, classhisto, cleanup, compaction, constraints, constantpool, coops, cpu, cset, data, defaultmethods, dump, ergo, exceptions, exit, fingerprint, freelist, gc, hashtables, heap, humongous, ihop, iklass, init, itables, jni, jvmti, liveness, load, loader, logging, mark, marking, methodcomparator, metadata, metaspace, mmu, module, monitorinflation, monitormismatch, nmethod, normalize, objecttagging, obsolete, oopmap, os, pagesize, patch, path, phases, plab, promotion, preorder, protectiondomain, ref, redefine, refine, region, remset, purge, resolve, safepoint, scavenge, scrub, stacktrace, stackwalk, start, startuptime, state, stats, stringdedup, stringtable, stackmap, subclass, survivor, sweep, task, thread, tlab, time, timer, update, unload, verification, verify, vmoperation, vtables, workgang, jfr, system, parser, bytecode, setting, event
 Specifying 'all' instead of a tag combination matches all tag combinations.
 
Described tag combinations:
 logging: Logging for the log framework itself
 
Available log outputs:
 stdout, stderr, file=<filename>
 Specifying %p and/or %t in the filename will expand to the JVM's PID and startup timestamp, respectively.
[..]

Appendix A: Word Cloud

The word-cloud used for this article was generated using the free generator at https://www.wordclouds.com/.

Search
Categories