
Trying to keep Java’s memory usage under control in a CI environment can be something of a dark art.

With the wealth of build frameworks available for Java/Android projects — Java, Gradle, Maven (not to mention Kotlin and its own tooling ecosystem) — it can be difficult to control where your memory is going and how to limit it. There are a variety of different environment variables you can set to manage memory usage, all with similar names and syntax. These variables interact with one another in ways that might not initially make much sense. By default, projects on CircleCI build in virtual environments with 4GB of RAM. This RAM is shared among all processes running in your project: databases, tests, various tools/frameworks, as well as the ravenous Java Virtual Machine (JVM).

Without any memory limits, the Java Virtual Machine is known to pre-allocate significant amounts of memory in large chunks, which sometimes results in the out-of-memory (OOM) errors you see on CircleCI and other CI platforms. Additionally, CircleCI runs on virtual machines with lots of RAM, using cgroups to allocate a slice of the pie to each individual build. When the JVM asks its host how much RAM it can use, it sees the whole pie, rather than the RAM allocated to its build’s particular cgroup.
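To see what a JVM believes it can use, it helps to ask it directly. The following is a minimal sketch, not something from a CircleCI tool: the class name `HeapReport` and the helper `toMiB` are made up for illustration, and it relies only on the standard `Runtime` API.

```java
// Minimal sketch: print what this JVM believes its memory limits are.
// HeapReport and toMiB are illustrative names, not part of any CI tooling.
final class HeapReport {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the upper bound the heap may grow to: -Xmx if set,
        // otherwise a default the JVM derives from the memory it detects.
        System.out.println("max heap (MiB):        " + toMiB(rt.maxMemory()));
        // totalMemory() is the heap currently reserved from the OS.
        System.out.println("committed heap (MiB):  " + toMiB(rt.totalMemory()));
        // freeMemory() is the unused part of the committed heap.
        System.out.println("free in committed:     " + toMiB(rt.freeMemory()));
        System.out.println("available processors:  " + rt.availableProcessors());
    }
}
```

Running this inside a build container with no limits set, and then again with an explicit `-Xmx`, is a quick way to check whether the default maximum heap is larger than the memory your build is actually allowed to use.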

Finally, when OOM errors do appear, they’re often little more than an exit code 137 error, buried at the bottom of a long build log. What causes these kinds of errors, and what are the best ways to mitigate them? Let’s walk through the different ways of setting JVM memory limits. You can refer to this handy chart to see exactly how these different environment variables interact.

Numbers indicate order of precedence, i.e., 0 takes greatest precedence, 3 takes least precedence.

Java environment variable   java   gradle   maven   kotlin   lein
_JAVA_OPTIONS               0      0        0       0        0
JAVA_TOOL_OPTIONS           2      3        2       2        2
JAVA_OPTS                   no     2        no      1        no
JVM_OPTS                    *      no       no      no       *
LEIN_JVM_OPTS               no     no       no      no       1
GRADLE_OPTS                 no     1        no      no       no
MAVEN_OPTS                  no     no       1       no       no
CLI args                    1      no       no      no       no

* lein will pass the value of JVM_OPTS to the Java process it spawns; however, this env var does not affect lein itself (for that, use LEIN_JVM_OPTS) and will also not affect any separate Java processes launched directly (for that, use _JAVA_OPTIONS or JAVA_TOOL_OPTIONS).
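One way to read the chart is as a lookup: for a given tool, list the sources in precedence order and take the first one that is actually set. The sketch below encodes only the numbered cells of the chart (the `*` entries for JVM_OPTS are left out because they behave differently, as the footnote explains). The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Sketch: the precedence chart as data. Names here are illustrative only.
final class JvmOptionPrecedence {

    // For each tool, sources ordered from highest to lowest precedence,
    // copied from the numbered cells of the chart.
    private static final Map<String, List<String>> ORDER = new HashMap<>();
    static {
        ORDER.put("java",   Arrays.asList("_JAVA_OPTIONS", "CLI args", "JAVA_TOOL_OPTIONS"));
        ORDER.put("gradle", Arrays.asList("_JAVA_OPTIONS", "GRADLE_OPTS", "JAVA_OPTS", "JAVA_TOOL_OPTIONS"));
        ORDER.put("maven",  Arrays.asList("_JAVA_OPTIONS", "MAVEN_OPTS", "JAVA_TOOL_OPTIONS"));
        ORDER.put("kotlin", Arrays.asList("_JAVA_OPTIONS", "JAVA_OPTS", "JAVA_TOOL_OPTIONS"));
        ORDER.put("lein",   Arrays.asList("_JAVA_OPTIONS", "LEIN_JVM_OPTS", "JAVA_TOOL_OPTIONS"));
    }

    /** Returns the highest-precedence source that is set, if any. */
    static Optional<String> winner(String tool, Set<String> sourcesThatAreSet) {
        List<String> order = ORDER.get(tool);
        if (order == null) {
            return Optional.empty(); // tool not covered by the chart
        }
        return order.stream().filter(sourcesThatAreSet::contains).findFirst();
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>(Arrays.asList("JAVA_TOOL_OPTIONS", "GRADLE_OPTS"));
        System.out.println("gradle -> " + winner("gradle", set).orElse("(none)"));
    }
}
```

With both JAVA_TOOL_OPTIONS and GRADLE_OPTS set, the lookup for gradle picks GRADLE_OPTS; add _JAVA_OPTIONS to the set and it wins for every tool.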

_JAVA_OPTIONS
This is the most powerful Java environment variable. It’s read directly by the JVM and overrides any other Java environment variables, as well as any arguments you pass on the command line (e.g., java -Xmx512m -Xms64m). For this reason, _JAVA_OPTIONS isn’t typically recommended; a more focused approach usually gets the job done just as well.

It’s also worth noting that _JAVA_OPTIONS is Oracle-specific, so it won’t work with everything. For example, you’d need to use IBM_JAVA_OPTIONS for IBM’s Java tools.

JAVA_TOOL_OPTIONS
This is a safe choice for setting Java memory limits. It’s read by all Java virtual machines and is easily overridden, either with command-line arguments or, depending on your build tool, more specific environment variables. It’s also better at handling quotes than _JAVA_OPTIONS.

JAVA_OPTS
Somewhat misleadingly, JAVA_OPTS isn’t actually read by the JVM, but rather used by a variety of common Java-based tools/languages to pass memory limits to the JVM.

JVM_OPTS
JVM_OPTS is Clojure-specific: lein uses it to pass memory limits to the JVM it spawns. However, it doesn’t actually affect lein’s own available memory; for that, you’ll need LEIN_JVM_OPTS. Furthermore, it’s not natively recognized by Java, so you can’t use it to pass memory limits to Java directly; to do that, see _JAVA_OPTIONS or JAVA_TOOL_OPTIONS.

GRADLE_OPTS
Predictably, this variable is used to set memory limits for Gradle projects. It takes precedence over any general env vars used to set JVM memory limits, except _JAVA_OPTIONS.

MAVEN_OPTS
You can set Java memory limits for projects built with Apache Maven using MAVEN_OPTS. Like GRADLE_OPTS, this will override JAVA_TOOL_OPTIONS, but not _JAVA_OPTIONS.
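When several of these variables are in play, it can be useful to print which ones are set and which options the running JVM actually received. Here is a rough sketch using only standard library calls; `JvmOptionSources` and `presentIn` are made-up names. Note one assumption: whether options picked up from environment variables show up in `getInputArguments()` depends on the JVM implementation, so treat that part of the output as a hint rather than proof.

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: report which of the variables discussed above are set, and what
// the running JVM says it was started with. Names are illustrative only.
final class JvmOptionSources {

    static final List<String> VARIABLES = Arrays.asList(
            "_JAVA_OPTIONS", "JAVA_TOOL_OPTIONS", "JAVA_OPTS", "JVM_OPTS",
            "LEIN_JVM_OPTS", "GRADLE_OPTS", "MAVEN_OPTS");

    /** Returns only the listed variables that are present in the given environment. */
    static Map<String, String> presentIn(Map<String, String> env) {
        Map<String, String> found = new LinkedHashMap<>();
        for (String name : VARIABLES) {
            String value = env.get(name);
            if (value != null) {
                found.put(name, value);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Environment variables visible to this process.
        presentIn(System.getenv())
                .forEach((name, value) -> System.out.println(name + "=" + value));

        // Options the JVM reports it was started with (implementation-dependent
        // as to whether options from environment variables are included).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM input arguments: " + jvmArgs);

        // The effective heap ceiling, whatever source it came from.
        System.out.println("max heap (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```

Because the final line prints the effective heap ceiling, you can compare it against the `-Xmx` value you expected and work backwards to the variable that set it.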

Debugging OOM errors


When it comes to debugging opaque OOM errors, your best bet is to look for that exit code 137. Let’s look at Gradle builds as an example. Gradle can produce confusing error messages when builds use too much memory on Docker and are terminated by the Linux OOM killer.

Gradle works by launching child processes for the building process. When a build uses too much memory, it’s typically the fault of a child process, not the parent. The child process is killed, and the parent is notified that the child exited with code 137. The error message might say something about “unexpected process exit with code 137,” or you might see “Gradle build daemon disappeared unexpectedly” (exit code 1), or even just “./gradlew died unexpectedly,” without any exit code at all. None of these error messages mention the words “memory”, “cgroup”, “docker”, or “oom killer”, so it can be really tough to diagnose the problem.
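The number 137 itself is a clue. By shell convention, a process killed by a signal is reported with exit code 128 plus the signal number, and 137 is 128 + 9, where signal 9 is SIGKILL on Linux (the signal the OOM killer sends). A small decoding sketch, with a hypothetical `ExitCodes` helper:

```java
// Sketch: decode shell-style exit codes of the form 128 + signal number.
// ExitCodes is an illustrative helper, not part of Gradle or CircleCI.
final class ExitCodes {

    /** Describes an exit code, flagging codes that indicate death by signal. */
    static String describe(int exitCode) {
        if (exitCode > 128 && exitCode < 256) {
            int signal = exitCode - 128;
            String name;
            switch (signal) {
                case 9:  name = "SIGKILL"; break;  // used by the Linux OOM killer
                case 15: name = "SIGTERM"; break;
                default: name = "signal " + signal;
            }
            String hint = signal == 9 ? " (possibly the OOM killer)" : "";
            return "killed by " + name + hint;
        }
        return exitCode == 0 ? "success" : "failed with exit code " + exitCode;
    }

    public static void main(String[] args) {
        for (int code : new int[] {0, 1, 137, 143}) {
            System.out.println(code + ": " + describe(code));
        }
    }
}
```

SIGKILL can come from sources other than the OOM killer, so a 137 is strong evidence of a memory problem rather than proof.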

There’s help on the horizon, though: Java has a new(ish) ability to read the cgroup memory limits of your build’s Docker container, rather than (mis)reading the total memory of the entire machine. These new options should make it easier to get the JVM to use “most” of the memory on the machine, without going over.
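To check whether the JVM’s idea of its limit matches the container’s, you can compare the cgroup limit with the JVM’s maximum heap yourself. The sketch below assumes the usual Linux mount points for cgroup v2 (`/sys/fs/cgroup/memory.max`) and cgroup v1 (`/sys/fs/cgroup/memory/memory.limit_in_bytes`); your environment may mount them elsewhere, and the class name is made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalLong;

// Sketch: compare the container's cgroup memory limit with the JVM's max heap.
// The file locations are the common defaults and are an assumption here.
final class CgroupMemory {

    private static final String[] CANDIDATES = {
        "/sys/fs/cgroup/memory.max",                    // cgroup v2
        "/sys/fs/cgroup/memory/memory.limit_in_bytes"   // cgroup v1
    };

    /** Parses a limit file's content; "max" (cgroup v2) means no limit. */
    static OptionalLong parseLimit(String raw) {
        String text = raw.trim();
        if (text.isEmpty() || text.equals("max")) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(text));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    /** Reads the first limit file that exists, or empty if none is readable. */
    static OptionalLong readLimit() {
        for (String candidate : CANDIDATES) {
            Path path = Paths.get(candidate);
            if (Files.isReadable(path)) {
                try {
                    byte[] bytes = Files.readAllBytes(path);
                    return parseLimit(new String(bytes, StandardCharsets.UTF_8));
                } catch (IOException ignored) {
                    // fall through and try the next candidate
                }
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        OptionalLong limit = readLimit();
        System.out.println("JVM max heap (bytes): " + maxHeap);
        if (limit.isPresent()) {
            System.out.println("cgroup limit (bytes): " + limit.getAsLong());
            if (maxHeap > limit.getAsLong()) {
                System.out.println("warning: max heap exceeds the cgroup limit");
            }
        } else {
            System.out.println("no cgroup memory limit found");
        }
    }
}
```

On cgroup v1, an unlimited container reports a very large number rather than `max`, so a huge value should be read as "no limit" too.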

In conclusion, it’s best to ensure your -Xmxn maximum size is large enough for your Java/Gradle/Maven applications to build, test, and deploy, but small enough for other processes to adequately share the remaining memory in your CircleCI build container!
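As a rough worked example of that balancing act: start from the container’s memory, subtract what the other processes need, and leave some headroom for the JVM’s own non-heap usage (metaspace, thread stacks, code cache). The figures below are purely hypothetical, and `HeapBudget` is an illustrative helper, not a recommendation for any particular project.

```java
// Sketch: back-of-the-envelope -Xmx calculation. All inputs are assumptions
// you would measure or estimate for your own build.
final class HeapBudget {

    /**
     * @param containerMiB      memory available to the whole build container
     * @param otherProcessesMiB memory reserved for databases, tools, etc.
     * @param nonHeapPercent    share of the JVM's budget kept for non-heap use
     * @return an -Xmx flag sized to the remaining budget
     */
    static String suggestXmx(long containerMiB, long otherProcessesMiB, int nonHeapPercent) {
        if (containerMiB <= 0 || otherProcessesMiB < 0
                || nonHeapPercent < 0 || nonHeapPercent >= 100) {
            throw new IllegalArgumentException("inputs out of range");
        }
        long jvmBudgetMiB = containerMiB - otherProcessesMiB;
        if (jvmBudgetMiB <= 0) {
            throw new IllegalArgumentException("no memory left for the JVM");
        }
        long heapMiB = jvmBudgetMiB * (100 - nonHeapPercent) / 100;
        return "-Xmx" + heapMiB + "m";
    }

    public static void main(String[] args) {
        // Hypothetical: 4 GB container, 1 GB for other processes, 25% headroom.
        System.out.println(suggestXmx(4096, 1024, 25));
    }
}
```

With those hypothetical inputs, the JVM’s budget is 3072 MiB and three quarters of that goes to the heap, giving `-Xmx2304m`. If several JVMs run at once (for example, a Gradle daemon plus test workers), the budget has to be split among them.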

And, of course, we can always boost your project’s RAM if necessary.

Further reading: