Java 15 will see the first release of Project Loom. I believe this will be a game-changer for the JVM. In this post, I’d like to dive a bit into the reasons that lead me to believe that.
First, we need to understand the core problem. Then, I will try to describe how previous technologies try to solve it. Afterwards, we will see the approach taken by Project Loom. Finally, I’ll extrapolate on what effects the latter could have on the ecosystem.
We first have to remember that for a long time, computers had a single core. Even then, running multiple programs at the same time was a requirement: it was about running at least two, the Operating System and the program proper.
To achieve the illusion of parallelism, the OS relies on a trick. It runs a program, and if the program doesn’t finish within a specific timeframe, it stores its state for later usage. Then, it runs the next program in line. Several scheduling algorithms are available to decide which program runs next: round robin, weighted round robin, etc. The good thing is that all of it is nicely abstracted away from the developer by the concept of an Operating System thread. The OS takes care of all the heavy lifting, including storing the state of execution. However, threads have two drawbacks:
- A thread is heavyweight as it carries a lot of state with it
- A thread takes a lot of machine resources to create
In any case, a modern OS allows M threads, where M is in the order of a few thousand. Modern machines are multi-core, offering N cores, with N about two orders of magnitude lower than M. Having multiple cores, whether physical or virtual, doesn’t fundamentally change the underlying mechanism: M is much higher than N, and the OS takes care of the mapping from a thread to a core.
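The time-slicing trick described above can be sketched as a toy simulation. This is only an illustration of round robin on a single core, not how a real OS scheduler is implemented; the names RoundRobinSketch, Task and quantum are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of round-robin time slicing on one core (illustrative only). */
final class RoundRobinSketch {

    /** A "program" reduced to a name and the amount of work it still has to do. */
    static final class Task {
        final String name;
        int remaining;

        Task(String name, int remaining) {
            this.name = name;
            this.remaining = remaining;
        }
    }

    /** Runs every task for at most `quantum` units per turn and returns the execution order. */
    static List<String> run(List<Task> tasks, int quantum) {
        Deque<Task> ready = new ArrayDeque<>(tasks);
        List<String> timeline = new ArrayList<>();
        while (!ready.isEmpty()) {
            Task current = ready.poll();
            int slice = Math.min(quantum, current.remaining);
            current.remaining -= slice;              // "run" the task for one time slice
            timeline.add(current.name + ":" + slice);
            if (current.remaining > 0) {
                ready.add(current);                  // state is kept, task goes to the back of the queue
            }
        }
        return timeline;
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(new Task("A", 5), new Task("B", 2), new Task("C", 3));
        System.out.println(run(tasks, 2));           // prints [A:2, B:2, C:2, A:2, C:1, A:1]
    }
}
```

Each entry in the output is one time slice: no task runs to completion in one go unless it fits in the quantum, yet all of them make progress, which is what creates the illusion of parallelism.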
The above model works well in legacy scenarios, but not so well in web ones. Imagine a web server that needs to respond to an HTTP request. In the not-so-good old days, CGI was one way to handle requests. It mapped each request to a process, so that handling the request required creating a whole new process, which was cleaned up after the response was sent.
Java EE application servers improved the situation a lot, as implementations kept threads in a pool to be reused later. However, imagine that the generation of the response takes time, e.g. because it needs to access the database to read data. Until the database has returned data, the thread needs to wait.
This means threads are actually waiting for most of their lifetime. On one side, such threads do not use any CPU on their own. On the flip side, they use other kinds of resources, in particular memory.
Also, too many threads are a burden on the Operating System: the OS has to balance a huge number of threads on a limited number of CPU cores. This costs precious CPU cycles, and as a result, the OS is competing for CPU with applications.
Now, the number of concurrent requests can go much higher than the number of threads available to the server. Hence, a thread that blocks is a waste of resources, and can make a server unresponsive. To remedy that, more web servers can be added to handle the load: this is horizontal scaling.
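To make the cost of blocking concrete, here is a minimal sketch of a pooled server whose threads spend their time waiting. It is an assumption-laden toy: Thread.sleep stands in for a blocking database call, and BlockingPoolSketch and serve are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Toy "server": a fixed pool of threads handling requests that block. */
final class BlockingPoolSketch {

    /** Submits `requests` blocking tasks and returns the elapsed wall-clock time in milliseconds. */
    static long serve(int poolSize, int requests, long blockMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(pool.submit(() -> {
                    Thread.sleep(blockMillis);   // stands in for a blocking database call
                    return null;
                }));
            }
            for (Future<?> response : pending) {
                response.get();                  // wait until every request has been answered
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 6 requests of 100 ms on 2 threads need about 3 rounds, although the CPU is idle throughout.
        System.out.println(serve(2, 6, 100) + " ms");
    }
}
```

The pool size, not the CPU, is the bottleneck here: the threads do nothing but wait, yet requests queue up behind them.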
In most cases, horizontal scaling is good enough. Time spent waiting on blocking threads is waste, but has no associated costs… unless one’s infrastructure resides in "the Cloud". In that case, one is paying for resources that are unused: that’s never a smart idea.
Single-threaded, reactive and Kotlin coroutine models
For a developer, managing multiple threads is complex. For example, if you have been using (or developing) Swing apps, you probably know about the "gray rectangle effect". This happens when a user interaction with a window, e.g. a click, starts a long-running task. If you move another window over the Swing window, then move it out, Swing won’t repaint the area where the other window intersected with the Swing window, leaving an ugly gray rectangle. The reason is that the long-running task is launched on the Event Dispatch Thread, instead of a dedicated thread. And this is one of the easier mistakes to avoid: it doesn’t even involve synchronizing on shared mutable state!
To avoid such issues, some stacks completely prevent developers from using multiple threads. For example, the API of Node.js offers only a single non-blocking event loop thread: submitted functions are in the form of callbacks. Note that this doesn’t prevent the implementation from using multiple threads.
The Reactive approach is another alternative, actually quite similar. While it gets rid of the single-thread API limitation and offers a back pressure mechanism, it still requires non-blocking code. Because OS threads are expensive, Reactive pools them, and reuses them throughout the application lifecycle. The core process is to get a free thread from the pool, let it execute code, and then release the thread back to the pool. That’s the reason why it requires non-blocking code: if code blocks, then the executing thread won’t be released, and the pool will be exhausted at one point or another.
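The callback style that non-blocking code imposes can be sketched with the JDK alone. Note the assumptions: this uses CompletableFuture as a stand-in and is not a Reactive Streams implementation (no back pressure), and findUser is a hypothetical asynchronous data source that answers after 50 ms.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Non-blocking request handling expressed as a chain of callbacks. */
final class NonBlockingSketch {

    /** Hypothetical async "database": the result arrives later, no caller thread waits for it. */
    static CompletableFuture<String> findUser(int id) {
        return CompletableFuture.supplyAsync(
                () -> "user-" + id,
                CompletableFuture.delayedExecutor(50, TimeUnit.MILLISECONDS));
    }

    /** Describes what to do once data is available instead of waiting for it. */
    static CompletableFuture<String> handleRequest(int id) {
        return findUser(id)
                .thenApply(String::toUpperCase)        // runs on a pooled thread once the data is there
                .thenApply(name -> "Hello, " + name);
    }

    public static void main(String[] args) {
        // join() blocks here only so that the demo can print the result before exiting.
        System.out.println(handleRequest(42).join());  // prints Hello, USER-42
    }
}
```

The method returns immediately with a future, and the thread that called it is free to serve another request. The price is that the sequential flow of the code is replaced by a pipeline of callbacks.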
I watched with interest how the Reactive model spread through the Spring ecosystem like wildfire, although I chose to stay on the sidelines. IMHO, Reactive suffers from several drawbacks:
- The mindset to write (and read!) reactive code is very different from the mindset to write traditional code. I willingly admit that changing one’s mindset just takes time, the duration depending on the developer.
- Though real developers don’t debug, I know a lot who do - including myself. Because of the thread-switching magic above, it’s not easy to follow the flow of a piece of code, with its associated state. This requires adequate tooling, such as IntelliJ IDEA with the relevant plugin.
- Finally, for the same reason, traditional stack traces are not helpful. Some black magic allows circumventing this. However, this is not for the faint of heart. For a complete list of options, check this document.
The Kotlin language offers an alternative to the Reactive approach: coroutines. In short, when using the suspend keyword, the Kotlin compiler generates a finite state machine in the bytecode. The benefit is that functions called in a coroutine block look like they are executed sequentially, though they are executed in parallel - or to be more precise, have the potential to be, depending on the exact scope.
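To give an idea of what "a finite state machine" means here, below is a hand-written Java analogue of a function with two suspension points. It is only a rough sketch of the principle, not the code the Kotlin compiler actually emits; FetchAndGreet, label and resume are names chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written analogue of a suspending function turned into a state machine. */
final class StateMachineSketch {

    static final class FetchAndGreet {
        private int label = 0;      // which step to execute on the next resume
        private String user;        // local variable that must survive a suspension
        final List<String> log = new ArrayList<>();

        /** Runs until the next suspension point; returns true once the function has completed. */
        boolean resume(String valueFromSuspension) {
            switch (label) {
                case 0:
                    log.add("request user");
                    label = 1;
                    return false;   // suspend: the calling thread is free to do something else
                case 1:
                    user = valueFromSuspension;
                    log.add("request greeting for " + user);
                    label = 2;
                    return false;   // suspend a second time
                case 2:
                    log.add(valueFromSuspension + ", " + user);
                    label = 3;
                    return true;    // completed
                default:
                    throw new IllegalStateException("already completed");
            }
        }
    }

    public static void main(String[] args) {
        FetchAndGreet call = new FetchAndGreet();
        call.resume(null);          // runs up to the first suspension point
        call.resume("alice");       // resumed with the result of the first async call
        call.resume("Hello");       // resumed with the result of the second async call
        System.out.println(call.log);
        // prints [request user, request greeting for alice, Hello, alice]
    }
}
```

Written as a coroutine, the three steps would read as three consecutive lines. The state machine is what lets the function give up its thread between them.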
Project Loom and virtual threads
Both the Reactive model and Kotlin coroutines add an additional abstraction layer between client code and JVM threads. It’s the responsibility of the framework/library to map one to the other dynamically. The crux of the matter is that a JVM thread is a thin wrapper around an OS thread: remember that OS threads are expensive to create, and limited in quantity to a few thousand.
The goal of Project Loom is to actually decouple JVM threads from OS threads. When I first became aware of the initiative, the idea was to create an additional abstraction called Fiber (threads, Project Loom, you catch the drift?). A Fiber’s responsibility was to get an OS thread, make it run code, then release it back into a pool, just like the Reactive stack does.
The current proposal has a huge difference: instead of introducing a new Fiber class, it reuses one class quite familiar to Java developers - java.lang.Thread! Hence, in new JVM versions, some Thread objects might be heavyweight and mapped to OS threads, while some might be virtual threads.
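Here is a minimal sketch of what this looks like in code. It assumes the API in the shape it took when virtual threads were finalized in JDK 21 (Thread.ofVirtual() and Thread.isVirtual()), so it needs JDK 21 or later; earlier Loom builds exposed different entry points. VirtualThreadSketch and runBlockingTasks are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Platform and virtual threads share the same java.lang.Thread type (requires JDK 21+). */
final class VirtualThreadSketch {

    /** Starts `count` virtual threads that each block briefly; returns how many finished. */
    static int runBlockingTasks(int count) {
        AtomicInteger done = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            threads.add(Thread.ofVirtual().start(() -> {
                try {
                    Thread.sleep(100);               // blocking call: only the virtual thread is parked
                    done.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return done.get();
    }

    public static void main(String[] args) {
        Thread platform = new Thread(() -> { });                  // classic thread, backed by an OS thread
        Thread virtual = Thread.ofVirtual().unstarted(() -> { }); // same class, scheduled by the JVM
        System.out.println(platform.isVirtual() + " " + virtual.isVirtual()); // prints false true
        System.out.println(runBlockingTasks(10_000));             // prints 10000
    }
}
```

Ten thousand threads that all block at the same time would be a stretch for OS threads. With virtual threads, the code stays plain, sequential and blocking, while the JVM takes care of mapping them onto a small number of OS threads.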
The main question is, now that the JVM API offers an abstraction over OS threads, what will become of other abstractions such as Reactive and coroutines? I’m bad at forecasts, but here are some possible attitudes from the companies behind Reactive/coroutines:
- One outcome is that they realize that their frameworks don’t bring any added value anymore and are just duplication. They stop their development effort, only providing maintenance releases to existing customers. They help said customers migrate to the new Thread API; some of that help might come in the form of paid consulting.
- Conversely, after having invested so much effort in their respective frameworks, they decide to continue as if nothing happened. For example, Pivotal, the company behind the Spring framework, took part in designing a shared Reactive API called Reactive Streams, with no Spring dependencies. Implementations include RxJava v2 and Pivotal’s Project Reactor. On their side, JetBrains has advertised Kotlin’s coroutines as being the easiest way to run code in parallel.
- Finally, there could be a middle ground. Both frameworks will continue their lives, but change their respective underlying implementation to use virtual threads.
Because of the sunk cost fallacy, #1 is highly unlikely: sales and marketing will push toward keeping their "competitive advantage" - whatever that means in their eyes. While some engineers will want to keep the existing code for the same reason, some others will push toward using the new API. Hence, I don’t believe #2 will happen either. However, I think the power play between both engineering factions, then between them and marketing/sales, will find a balance on #3.
Project Loom changes the existing Thread implementation from the mapping of an OS thread, to an abstraction that can either represent such a thread or a virtual thread. In itself, that is an interesting move on a platform that historically put a lot more value on backward-compatibility than on innovation. Compared to other recent Java versions, this feature is a real game-changer. Developers in general should start getting familiar with it as soon as possible. Developers who are about to learn about Reactive and coroutines should probably take a step back, and evaluate whether they should instead learn the new Thread API - or not.
Thanks a lot to my colleague Jaromir Hamala for his detailed review!