JVM Configuration

Humio® runs on the Java Virtual Machine (JVM). In this section we briefly describe things you should consider when selecting and configuring the JVM for Humio.

Which JVM?

Humio requires a Java version 11 or later JVM to function properly. We operate Humio Cloud on the Azul-provided Zulu JVM, version 11, and our Docker container uses this JVM as well.

We recommend you use one of the following excellent and well-tested distributions of the JVM when operating Humio.

Java version 11

Provider           Name                   Architectures
Amazon AWS         OpenJDK 11 Corretto    x86_64
AdoptOpenJDK.net   OpenJDK 11 (HotSpot)   x86_64
Azul Systems       OpenJDK 11 Zulu        x86_64
BellSoft           OpenJDK 11 Liberica    x86_64, ARMv8
Oracle             Java SE 11             x86_64
Oracle             OpenJDK 11             x86_64

Java version 12

Provider           Name                   Architectures
Azul Systems       OpenJDK 12 Zulu        x86_64
AdoptOpenJDK.net   OpenJDK 12 (HotSpot)   x86_64
BellSoft           OpenJDK 12 Liberica    x86_64, ARMv8

Java version 13

Provider           Name                   Architectures
Azul Systems       OpenJDK 13 Zulu        x86_64
AdoptOpenJDK.net   OpenJDK 13 (HotSpot)   x86_64
BellSoft           OpenJDK 13 Liberica    x86_64, ARMv8

What about…

  • OpenJ9 — We have not yet qualified Humio on OpenJ9 (as opposed to HotSpot), so we cannot recommend it yet.
  • Azul Zing — While we have not tried Zing ourselves, our experience with Zulu leads us to believe it is likely to work, and there may be benefits to its proprietary, commercially supported C4 concurrent, non-generational garbage collector.
  • Oracle’s Graal and SubstrateVM — Graal is an interesting alternative to the C2 HotSpot JIT in the OpenJDK. It is not yet supported for production use with Humio; we plan to investigate and support it as it matures.

Java memory options

We recommend that systems running Humio have as much RAM as possible, but not for the sake of the JVM: Humio itself will operate comfortably within 10 GiB for most workloads. The rest of the RAM in your system should remain available for use as filesystem page cache.

A good rule of thumb for memory allocation is as follows:

  • (8 GB baseline + 1 GB per core) of heap, plus that much again in off-heap (direct) memory

So, for a production installation on an 8-core VM, you would want about 64 GB of memory with JVM settings as follows:

-server -Xms16G -Xmx16G -Xss2M -XX:MaxDirectMemorySize=16G

This gives Humio a heap size of 16 GB and a further 16 GB of direct memory (used by direct byte buffers). That leaves roughly 32 GB of memory for OS processes and the filesystem page cache. For large installations, more memory available to the filesystem cache translates into faster queries, so we recommend using as much memory as is economically feasible on your hardware.

For a smaller, two-core system that would look like this:

-server -Xms10G -Xmx10G -Xss2M -XX:MaxDirectMemorySize=10G

That gives Humio a heap size of 10 GB and a further 10 GB of direct memory, so you would most likely want a system with 32 GB of memory.

It’s definitely possible to run Humio on smaller systems with less memory than this, but we recommend a system with at least 32 GB of memory for all but the smallest installations.
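
Whichever sizing you choose, the options need to be passed to the JVM that runs Humio. As a sketch only, assuming your installation reads extra JVM options from an environment variable such as HUMIO_JVM_ARGS (this name is an assumption; check how your Docker image, package, or service unit actually passes options to the JVM):

# Hypothetical: set the JVM options for the 8-core example before starting Humio.
export HUMIO_JVM_ARGS="-server -Xms16G -Xmx16G -Xss2M -XX:MaxDirectMemorySize=16G"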

To view how much memory is available for use as filesystem page cache, you can run the following command:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           125G         24G        1.7G        416K         99G         99G
Swap:           33G         10M         33G

The available column shows how much memory is currently free for use as page cache, and the buff/cache column shows how much memory is currently being used for buffers and page cache.

If you’re installing on a system with two CPU sockets using our Ansible scripts, then you will end up with two Humio JVM processes running on your system. Under these conditions, the memory requirement will double, so keep that in mind when planning.

Garbage collection

Humio has been tested and run using both the Garbage-First (-XX:+UseG1GC) and the older parallel (-XX:+UseParallelOldGC) collectors. Both work quite well with our standard workload. While optimized not to allocate objects unless necessary, Humio still incurs a good deal of memory pressure and class unloading due to the nature of Scala on the JVM. When considering GC options, the preference is for throughput over predictable low latency, meaning that Humio as a solution is tolerant of pauses induced by GC collections. That said, newer concurrent GC algorithms are getting better at balancing these two competing requirements and offer a more predictable application experience.

A key requirement of any GC is that it returns unused memory to the operating system, as we depend on the filesystem page cache for a significant part of the system's performance.

Regardless of which collector you use, we recommend configuring the JVM for verbose garbage collector logging, and then storing and monitoring those logs within Humio itself:

-Xlog:gc+jni=debug:file=/var/log/humio/gc.log:time,tags:filecount=5,filesize=102400
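
The selector above captures JNI-related GC events at debug level. If you would rather capture the full range of GC events, the standard JDK 11+ unified logging wildcard can be used with the same output options (plain JDK syntax, nothing Humio-specific):

-Xlog:gc*:file=/var/log/humio/gc.log:time,tags:filecount=5,filesize=102400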

It can be helpful to request that the JVM attempt a bit of scavenging before stopping the entire JVM for a full collection:

-XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC

Shenandoah GC

Shenandoah GC is a new, fully concurrent, non-generational garbage collector developed for the OpenJDK. As of JDK 13 we find that it provides low and predictable pause times without impacting throughput, and it returns memory to the operating system when not in use. While not available in all Java distributions (most notably Oracle's), it is a very good choice for Humio. To enable it (again, we recommend JDK 13 or later for this GC), use the following flags:

-XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:+ClassUnloadingWithConcurrentMark
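
Putting the pieces together, a complete set of options for the 8-core example above might look like the following (a sketch only; all values come from the earlier examples and should be adjusted for your hardware):

-server -Xms16G -Xmx16G -Xss2M -XX:MaxDirectMemorySize=16G
-XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:+ClassUnloadingWithConcurrentMark
-Xlog:gc+jni=debug:file=/var/log/humio/gc.log:time,tags:filecount=5,filesize=102400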

The Z Garbage Collector (ZGC)

ZGC is also relatively new within the JVM and offers very low pause times and other features favorable to Humio. However, we have discovered that ZGC (as of JDK 11) reserves memory as “shared memory”, which lowers the amount available for disk caching. As Humio is generally IO bound, being able to cache as much of the block device in RAM as possible translates into lower latency and higher throughput. We recommend against using ZGC until we have tested the implications of JEP 351 (ZGC: Uncommit Unused Memory), which we hope addresses this issue.

Verify physical memory is available for filesystem page cache

Once you have Humio (and perhaps also Kafka, Zookeeper, and other software) running on your server, verify that there is ample memory remaining for caching files using the command free -h. On a server with 128 GB of RAM we usually see around 90 GB as “available”. If the number is much lower, due to a large amount being either “used” or “shared”, then you may want to improve on that. However, if you have a very fast IO subsystem, such as one based on a RAID 0 stripe of fast NVMe drives, you may find that using memory for caching has no effect on query performance.

You can check by dropping the OS file cache with sudo sysctl -w vm.drop_caches=3, which discards any cached file data, and then comparing the speed of the same trivial query run multiple times. A good test is to run a simple count() twice over the same fixed time interval, on a set of data large enough that the query takes 5-10 seconds to execute. If you benefit from the page cache, you will see a much faster response on the second and subsequent runs compared to the first.
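
A minimal sketch of that procedure, where run-query is a hypothetical placeholder for however you execute the query (the UI, the API, or otherwise):

# Drop the OS page cache (requires root), then time the same query twice.
sudo sysctl -w vm.drop_caches=3
time run-query     # first run: cold cache, data is read from disk
time run-query     # second run: warm cache; much faster if the page cache is helping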

Another way to validate that the IO subsystem is fast is to inspect the output of iostat -xm 2 while running a query after dropping the filesystem page cache as shown above. If the NVMe drives are close to 100% utilized, then you will benefit from having memory for page caching.

A note on NUMA (multi-socket) systems

A NUMA-aware JVM partitions the heap with respect to the NUMA nodes: when a thread creates a new object, it is allocated on the NUMA node of the core running that thread, so if the same thread later uses the object, it will be in local memory. When compacting the heap, a NUMA-aware JVM also avoids moving large chunks of data between nodes, which reduces the length of stop-the-world events.

So on NUMA hardware, the -XX:+UseNUMA option should be enabled for any Java application.

JEP 345 (NUMA-Aware Memory Allocation for G1) is still unresolved, so G1 is not yet NUMA-aware.

Shenandoah GC does not support NUMA. ZGC has only basic NUMA support, which is enabled by default on multi-socket systems and can also be requested explicitly with the -XX:+UseNUMA option.

The parallel collector (enabled by -XX:+UseParallelGC) has been NUMA-aware for many years and is known to work quite well.
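
For example, to combine the parallel collector with NUMA-aware allocation:

-XX:+UseParallelGC -XX:+UseNUMA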

Humio fully utilizes the available IO channels, physical memory, and CPU during query execution. Coordinating memory across cores can slow Humio down. We recommend running a single JVM on each separate CPU (socket, not core) and instructing the operating system that the process should remain on that socket, using only the memory most tightly bound to it. On Linux you can use the numactl executable to do this:

/usr/bin/numactl --cpunodebind=%i --membind=%i
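
The %i above is a template placeholder for the socket number (filled in, for example, by a systemd instance name or by the Ansible scripts). Spelled out for a two-socket machine, and assuming whatever java invocation your installation normally uses, the two processes would be started roughly like this:

/usr/bin/numactl --cpunodebind=0 --membind=0 -- java <humio options and jar>   # socket 0
/usr/bin/numactl --cpunodebind=1 --membind=1 -- java <humio options and jar>   # socket 1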

Helpful Java/JVM resources