Why Is Java Stuck With Complex GCs?

Modern programming languages have relatively simple garbage collectors compared to Java. Why is Java different?

Jul 23, 2022

Java comes with a variety of garbage collectors with different capabilities, complexities, and performance characteristics. However, to get good performance, Java programs usually need a complex garbage collector. Yet other high-performance programming languages, such as Go and Julia, work quite well with much simpler garbage collectors. Why is that?

In this story, we will look at the language design choices made in Java, which cause the language to rely so heavily on sophisticated garbage collectors for performance.

Memory fragmentation—Impact on Java GC design. Why fragmentation matter less to other languages.
Compacting Garbage Collectors — Move around previously allocated bits of memory. Introduce delays.
Generational Garbage collectors — Solve delays by dividing allocated objects into long lived and short lived objects.
Value types — How lacking value types put more pressure on the GC.

The History of Java Reliance on Complex Garbage Collectors

Java is a language that basically outsourced memory management entirely to its garbage collector. This turned out to be a big mistake. However, to be able to explain why, I need to cover more of the details.

Let us start at the beginning. It is 1991 and the work on Java has begun. Garbage collectors are all the rage. Research looks very promising, and the designers of Java are placing their bets on advanced garbage collectors being able to solve essentially all challenges in managing memory.

For this reason, all objects in Java are designed to be allocated on the heap, except for primitive types such as integers and floating-point values. When talking about memory allocation, we generally distinguish between what is called the heap and the stack. The stack is very fast to use, but has limited space and can only be used for objects that do not need to exist beyond the lifetime of a function call. It is only for local variables.

Showing how three objects foo, bar and qux could be allocated on either a stack or a heap in memory.

The heap can be used for all objects. Java basically ignored the stack and chose to allocate everything on the heap, except primitives like integers and floats. Whenever you write new Something() in Java, you are consuming memory on the heap.

However, this type of memory management is actually quite expensive in terms of memory usage. You would think that creating an object with only a 32-bit integer would only require 4 bytes of memory.

class Knight {
   int health;
}

However ,in order for the garbage collector to work, Java stores a header with information such as:

Type — To identify the class or type of the object.
Lock — Used in synchronize statements.
Mark — Used during mark and sweep face of the garbage collector.

This data typically amount to 16 bytes. Hence, there is a ratio of 4:1 in header data to actual data. The C++ source code for a Java object defined as: OpenJDK base class.

class oopDesc {
    volatile markOop  _mark;   // for mark and sweep
    Klass*           _klass;   // the type
}

Memory Fragmentation

The next problem is memory fragmentation. When Java allocates an array of objects, what it really does is create an array of references pointing to objects at some other location in memory. These objects could end up scattered around heap memory. That is bad for performance because modern microprocessors don't read individual bytes of data. Because initiating a memory transfer is slow, the microprocessor always reads a large contiguous block of memory each time it tries to access one particular memory location.

This block of memory is called a cache line. The CPU has its own high-speed memory called cache. This is of a much smaller size than main memory. It is used to store the most recently accessed objects because those are likely to get accessed again. If main memory is fragmented, that means cache lines will be fragmented and the CPU cache will get filled up with lots of useless data.

The CPU doesn't load individual bytes. It loads multiple bytes called a cache line.

Compacting Garbage Collectors

To avoid fragmenting memory, Java developers have invested a ton of money into advanced garbage collectors. These collector do something called compaction. Compaction involves moving objects around memory and collect them into contiguous blocks a memory. This is not cheap. Not only does moving the blocks from one memory location to another cost CPU cycles, but updating every reference to these objects to point to the new locations also costs CPU cycles.

Doing these updates requires freezing all threads. You cannot update references while they are being used. This typically causes Java programs to have complete freezes of several hundred milliseconds where objects get moved around, references updated and unused memory reclaimed.

Generational Garbage Collectors

To reduce the long pauses introduced by compaction, Java uses what is called generational garbage collectors. These are based on the following premise:

that most values allocated in a program are quickly unused, so there is an advantage for the GC to spend more time looking at recently allocated objects.

That is why Java partition their allocated objects into two groups:

Old objects — Objects that have survived multiple mark and sweep operations by the GC. A generation counter is updated on each mark and sweep to keep track of how old the objects are.
Young objects — These have a low generation counter. Meaning they have only recently been allocated.

Java more actively investigates recently allocated objects and checks whether they should be reclaimed or moved. As objects age, they get moved out of the young generation area.

All of this naturally creates more complexity. It requires more development. It requires paying for more talented developers, costing more money.

Value Types

How do other modern programming languages avoid all the problems we have just outlined which Java attempts to solve by advanced garbage collection systems? One approach is to design the language so that one allocates fewer objects and thus need to free fewer objects. Such a strategy will reduce the work of a garbage collector. Consider creating multiple copies of simple Point object in Java:

// Java - Make an an array of 15 000 Point objects
// (pseudo code. Class definitions and private/public removed)

class Point {
    int x, y;
}

static void main(String[] args) {
    Point[] points = new Point[15000];
    for (int i = 0; i < points.length; i++) {
        points[i] = new Point();
    }
}

We will compare this Java code example with a Go code example which also allocates 15 000 Point objects.

// Go - Make an an array of 15 000 Point objects

struct {
    X, Y int
}

func main() {
    var points [15000]Point
}

In our Java code example we had to do 15 000 separate allocations in a for-loop, each producing a separate reference that must be managed. Every Java Point object gets the 16-byte header overhead I wrote about earlier. Neither in Go, Julia or Rust do you get this overhead. The objects are generally header-less.

In Java, the GC gets 15 000 separate objects it must track and manage. With Go you allocate only a single large chunk of memory containing all 15 000 points.

Go, Julia, Swift, Rust and many other modern languages allow you to create an array of thousands of objects with a single allocation, because they support what we call value types. Let me illustrate with another code example.

// Java - Define a rectangle

class Rect {
    Point min, max;
}

The min and max defines the extent of the rectangle. We can define a similar rectangle in Go.

// Go - Define a rectangle

type Rect struct {
   Min, Max Point
}

In Go this type becomes one contiguous block of memory when instantiated. In Java we get a Rect object with references to two separate objects, the min and the max point objects. Thus in Java, an instance of Rect would require 3 allocations, but only 1 allocation in Go, Rust, C/C++ and Julia.

On the left Java style memory fragmentation. On the right contiguous blocks of memory possible in Go, C/C++, Julia etc.

The lack of value types created significant problems when porting Git to Java. Without value types, it was hard to get good performance. As Shawn O. Pearce remarks on the JGit developer mailing list:

JGit struggles with not having an efficient way to represent a SHA-1. C can just say unsigned char[20] and have it inline into the container's memory allocation. A byte[20] in Java will cost an additional 16 bytes of memory, and be slower to access because the bytes themselves are in a different area of memory from the container object. We try to work around it by converting from a byte[20] to 5 ints, but that costs us machine instructions.

What are we talking about there? In Go, I can do the same thing as C/C++ and define a structure like this:

type Sha1 struct {
   data [20]byte
}

Those bytes will then be part of one block of memory. Java will create a pointer to somewhere else in memory.

Java developers realize they screwed up and that you really need value types for good performance. You can call that statement hyperbole, but then you need to explain Project Valhalla. This is an effort spearheaded by Oracle to give Java value types and the reasons they articulate for doing so point exactly to what I am talking about here.

Next, Garbage Collection in Go

In my next story we will cover the Go garbage collector in more detail. The purpose of this story was to give background information to help understand why Go does not need such a sophisticated garbage collector as Java. The focus here has been how how the Java choice of not having value types leads to allocation of large number of small objects, which would potentially get scattered around the heap unless we do compaction. Compaction again cause a need to introduce pauses which again push for a generational style garbage collector. Thus we can see how we get pushed towards a more sophisticated GC solution.

Erik Explores

Go Doesn't Need a Complex Java GC

Language design matters. Many developers are under the false impression that implementation and language designs are completely detached from each other. How often do you not hear a developer claim: There are no fast or slow languages, only fast and slow implementations of a programming language…

2 years ago · Erik Engheim

The next story will focus more on the language design choices made in Go to reduce the need for this kind of garbage collector. That is however not the only factor. Go unlike Java was made in a world where multi-core solutions dominate. That is a hardware reality which has also had major impact on how the Go garbage collector has been designed.