Coping with incompatible code in Graal VM AOT compilation

The last two blog posts focused on how to write a custom Kubernetes controller in Java. As usual, I’m writing a demo along with the posts: that lets me face real issues and detail them. I had to handle a couple of them while implementing the demo code. This post is dedicated to one of them: compiling incompatible Java code with Graal VM’s native image AOT compilation.

The context

In order for a Java Kubernetes controller to be on par with a Go-native one, there are a couple of requirements:

  1. To be of acceptable size
  2. To start up in a reasonable time

Docker images running JVM applications cannot comply with the first requirement, as they embed the JVM itself. Moreover, the JVM platform is well known for delivering great performance for long-running processes... at the cost of longer startup times. To cope with those requirements, one option is to use the Java 11 platform and modules to produce a standalone executable that doesn’t depend on the platform. Another option is to use Graal VM’s Ahead-Of-Time compilation capabilities to produce a native executable. This second option is the one I chose.

The concept behind Graal VM’s AOT compilation is easy to grasp. At build time, the native-image executable follows all execution paths reachable from the main entry point, and compiles the code it finds along the way. As usual, this works like a breeze for "Hello World" applications. Unfortunately, a controller is a bit more complex than that.

The issue

In particular, it requires a way to interact with the Kubernetes API server (please see this introduction to the Kubernetes architecture). For the demo, I’m using the Fabric8’s Kubernetes Client. The command line to build the native executable is the following:

native-image -jar /var/jvm-operator.jar -H:+ReportExceptionStackTraces

With the Docker image created, I schedule the pod on Kubernetes. This logs the following error stack trace:

Exception in thread "main" java.lang.ExceptionInInitializerError
	at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:290)
	at java.lang.Class.ensureInitialized(DynamicHub.java:467)
	at okhttp3.OkHttpClient.<clinit>(OkHttpClient.java:127)
	at com.oracle.svm.core.hub.ClassInitializationInfo.invokeClassInitializer(ClassInitializationInfo.java:350)
	at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:270)
	at java.lang.Class.ensureInitialized(DynamicHub.java:467)
	at okhttp3.OkHttpClient$Builder.<init>(OkHttpClient.java:475)
	at io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:73)
	at io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:64)
	at io.fabric8.kubernetes.client.BaseClient.<init>(BaseClient.java:51)
	at io.fabric8.kubernetes.client.BaseClient.<init>(BaseClient.java:43)
	at io.fabric8.kubernetes.client.DefaultKubernetesClient.<init>(DefaultKubernetesClient.java:89)
	at ch.frankel.kubernetes.extend.Sidecar.main(Sidecar.java:13)
Caused by: java.nio.charset.UnsupportedCharsetException: UTF-32BE
	at java.nio.charset.Charset.forName(Charset.java:531)
	at okhttp3.internal.Util.<clinit>(Util.java:75)
	at com.oracle.svm.core.hub.ClassInitializationInfo.invokeClassInitializer(ClassInitializationInfo.java:350)
	at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:270)

As seen from the stack trace, the problem comes from Square’s OkHttp, a dependency of the Kubernetes Client. The guilty code is the following (abridged):

public final class Util {

  private static final ByteString UTF_8_BOM = ByteString.decodeHex("efbbbf");
  private static final ByteString UTF_16_BE_BOM = ByteString.decodeHex("feff");
  private static final ByteString UTF_16_LE_BOM = ByteString.decodeHex("fffe");
  private static final ByteString UTF_32_BE_BOM = ByteString.decodeHex("0000ffff");
  private static final ByteString UTF_32_LE_BOM = ByteString.decodeHex("ffff0000");

  public static final Charset UTF_8 = Charset.forName("UTF-8");
  public static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");
  private static final Charset UTF_16_BE = Charset.forName("UTF-16BE");
  private static final Charset UTF_16_LE = Charset.forName("UTF-16LE");
  private static final Charset UTF_32_BE = Charset.forName("UTF-32BE");
  private static final Charset UTF_32_LE = Charset.forName("UTF-32LE");

  public static Charset bomAwareCharset(BufferedSource source, Charset charset) throws IOException {
    if (source.rangeEquals(0, UTF_8_BOM)) {
      source.skip(UTF_8_BOM.size());
      return UTF_8;
    }
    if (source.rangeEquals(0, UTF_16_BE_BOM)) {
      source.skip(UTF_16_BE_BOM.size());
      return UTF_16_BE;
    }
    if (source.rangeEquals(0, UTF_16_LE_BOM)) {
      source.skip(UTF_16_LE_BOM.size());
      return UTF_16_LE;
    }
    if (source.rangeEquals(0, UTF_32_BE_BOM)) {
      source.skip(UTF_32_BE_BOM.size());
      return UTF_32_BE;
    }
    if (source.rangeEquals(0, UTF_32_LE_BOM)) {
      source.skip(UTF_32_LE_BOM.size());
      return UTF_32_LE;
    }
    return charset;
  }
}

Initialization takes place at runtime: when the class is "loaded" (there’s no real classloading with native executables), each Charset constant is assigned the result of its respective Charset.forName() call. In the native executable I run, neither the UTF-32BE nor the UTF-32LE charset is available: the static initializer throws, and it’s game over.
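
The same failure mode can be reproduced on a regular JVM. What follows is a minimal sketch, not OkHttp’s code: the Holder class and the X-NOT-A-REAL-CHARSET name are made up for illustration. A static field whose initializer throws makes the first use of the class fail with an ExceptionInInitializerError that wraps the original exception, which is the shape of the stack trace above.

```java
import java.nio.charset.Charset;

public class StaticInitFailure {

    // Hypothetical stand-in for a class like okhttp3.internal.Util:
    // its static initializer asks for a charset the runtime does not have.
    static final class Holder {
        static final Charset MISSING = Charset.forName("X-NOT-A-REAL-CHARSET");

        static Charset missing() {
            return MISSING;
        }
    }

    // Remembered so that repeated calls return the same result: a class whose
    // initialization failed cannot be initialized a second time.
    private static Throwable cause;

    // Triggers Holder's static initializer on the first call and returns the
    // exception that made it fail.
    public static synchronized Throwable firstUseFailure() {
        if (cause == null) {
            try {
                Holder.missing();
            } catch (ExceptionInInitializerError e) {
                cause = e.getCause();
            }
        }
        return cause;
    }

    public static void main(String[] args) {
        // Prints the wrapped exception, an UnsupportedCharsetException here.
        System.out.println(firstUseFailure());
    }
}
```

Note that a single unavailable constant is enough to make the whole class unusable, even if the code path that needs it is never taken.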

Available solutions

This is a big issue, and here are a couple of solutions I tried with their respective results.

Add all charsets

Graal VM’s native image compilation has an option to add all charsets to the final executable: -H:+AddAllCharsets. This didn’t work until recently: I opened an issue and, despite a slight delay, it is fixed in the latest versions (from 20.0 onward).

There are two downsides with that approach:

  1. It’s specific to charset issues
  2. It adds all charsets to the final executable, and thus makes its size a bit larger
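
To check which charsets actually end up in an executable, a small probe can help. This is a sketch of my own, not part of the demo: the CharsetProbe class is hypothetical, and the list of names is simply the six charsets that Util requests. I would expect its output, once compiled with native-image, to depend on whether -H:+AddAllCharsets was passed, but I haven’t verified that on every Graal VM version.

```java
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CharsetProbe {

    // The six names that okhttp3.internal.Util resolves in its static initializer.
    public static final List<String> NAMES = Arrays.asList(
            "UTF-8", "ISO-8859-1", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE");

    // Maps each name to whether the current runtime can resolve it,
    // without throwing when a charset is missing.
    public static Map<String, Boolean> probe(List<String> names) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : names) {
            result.put(name, Charset.isSupported(name));
        }
        return result;
    }

    public static void main(String[] args) {
        probe(NAMES).forEach((name, supported) ->
                System.out.println(name + " -> " + (supported ? "available" : "missing")));
    }
}
```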

Fix the library

Another, much more cumbersome, solution is to actually fix the library. Given that OkHttp is Open Source, its source is available on GitHub. Conceptually, the fix is simple: remove the guilty code, build the fixed version, and reference the new artifact in the Maven POM in place of the original one.

Graal VM advertises this solution as the preferred one.
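
As an illustration of what such a fix could look like, here is a minimal sketch. It is not the change that was made in OkHttp, and the SafeCharsets class and its method names are hypothetical. The idea is to stop resolving optional charsets in a static initializer: the constants come from StandardCharsets, which I assume to be present in the native executable as they are on any JVM, while the UTF-32 charsets are looked up on demand with a fallback.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public final class SafeCharsets {

    // Charsets that every Java platform implementation must provide.
    public static final Charset UTF_8 = StandardCharsets.UTF_8;
    public static final Charset UTF_16_BE = StandardCharsets.UTF_16BE;
    public static final Charset UTF_16_LE = StandardCharsets.UTF_16LE;

    private SafeCharsets() {
    }

    // Resolves a charset only when it is asked for, and returns the fallback
    // instead of throwing when the runtime does not provide it.
    public static Charset lookup(String name, Charset fallback) {
        return Charset.isSupported(name) ? Charset.forName(name) : fallback;
    }

    // Optional charsets are exposed as methods, so a missing one can no
    // longer break class initialization.
    public static Charset utf32BigEndian(Charset fallback) {
        return lookup("UTF-32BE", fallback);
    }

    public static Charset utf32LittleEndian(Charset fallback) {
        return lookup("UTF-32LE", fallback);
    }
}
```

With this shape, a caller such as bomAwareCharset() could pass the charset it received as the fallback, so that an unavailable UTF-32 charset degrades gracefully instead of crashing at startup.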

Substitutions

Yet, there are times when the source code is not available, or when the artifact cannot easily be changed either. For that reason, Graal VM provides a substitution mechanism. At AOT compile-time, it can change specific bytecode by replacing or deleting it.

The first step is to add the Substrate VM JAR to the classpath:

<dependency>
  <groupId>com.oracle.substratevm</groupId>
  <artifactId>svm</artifactId>
  <version>19.2.1</version>
  <scope>provided</scope>
</dependency>

Note that the scope is provided, so as not to include it in the final artifact.

Documentation for substitutions at the time of this writing is very sparse. What follows is what I came up with during development, and may contain inaccuracies - or be completely different from what is actually implemented by the time you use it.

Substitutions are developed in Java - though I infer it would be possible to write them in any language that produces Java-compatible bytecode. There are two ways to configure code substitution.

With JSON configuration files

While I found a configuration file sample online, I didn’t find the documentation about the expected file format. Those files need to be referenced on the command-line when running native-image.

-H:SubstitutionFiles=...  Comma-separated list of file names with declarative substitutions. Default: None
-H:SubstitutionResources  Comma-separated list of resource file names with declarative substitutions. Default: None

With annotations

Annotations are found in the com.oracle.svm.core.annotate package.

The main ones are @TargetClass, to reference the original class that needs to be updated, and @Substitute, to actually update the code itself.

Here’s the code with which I fix the above issue:

import com.oracle.svm.core.annotate.Alias;
import com.oracle.svm.core.annotate.Substitute;
import com.oracle.svm.core.annotate.TargetClass;
import okhttp3.internal.Util;
import okio.BufferedSource;
import okio.ByteString;

import java.io.IOException;
import java.nio.charset.Charset;

@TargetClass(Util.class)                                  // (1)
public final class okhttp3_internal_Util {

    @Alias                                                // (3)
    private static ByteString UTF_8_BOM;
    @Alias                                                // (3)
    private static ByteString UTF_16_BE_BOM;
    @Alias                                                // (3)
    private static ByteString UTF_16_LE_BOM;
    @Alias                                                // (3)
    public static Charset UTF_8;
    @Alias                                                // (3)
    private static Charset UTF_16_BE;
    @Alias                                                // (3)
    private static Charset UTF_16_LE;

    // Same logic as the original method, minus the two UTF-32 branches
    @Substitute                                           // (2)
    public static Charset bomAwareCharset(BufferedSource source, Charset charset) throws IOException {
        if (source.rangeEquals(0, UTF_8_BOM)) {
            source.skip(UTF_8_BOM.size());
            return UTF_8;
        }
        if (source.rangeEquals(0, UTF_16_BE_BOM)) {
            source.skip(UTF_16_BE_BOM.size());
            return UTF_16_BE;
        }
        if (source.rangeEquals(0, UTF_16_LE_BOM)) {
            source.skip(UTF_16_LE_BOM.size());
            return UTF_16_LE;
        }
        return charset;
    }
}

(1) @TargetClass references the class to update.
(2) @Substitute marks the method that replaces the original one. By default, the method replaced is the one with the same name as the annotated one. The annotation accepts values to change that default behavior.
(3) Because the substituted method references fields of the original class, the substitution class needs to provide such references. To achieve that, a field with the same name as the original one is declared with the @Alias annotation.

Conclusion

Graal VM’s AOT compilation offers two main advantages: small executables and fast startup times. But to benefit from them, one has to actually compile standard bytecode to native code. For applications beyond "Hello World", there’s a good chance this requires a non-trivial effort. Even if one’s own code is AOT-friendly, issues may arise from the libraries it depends on.

To cope with that, the most straightforward way is to use one of Graal VM’s capabilities, such as the ability to add all charsets. An alternative is to fix the dependency to make it AOT-compatible. When that is not possible, Graal VM offers a mechanism to substitute the original bytecode with bytecode of one’s choosing. While its documentation can be improved, this mechanism allows a lot, such as deleting code or updating it.

Nicolas Fränkel

Developer Advocate with 15+ years of experience consulting for many different customers, in a wide range of contexts (such as telecoms, banking, insurance, large retail and the public sector). Usually working on Java/Java EE and Spring technologies, but with focused interests like Rich Internet Applications, Testing, CI/CD and DevOps. Currently working for Hazelcast. Also doubles as a teacher in universities and higher education schools and as a trainer, and triples as a book author.
