Generational Garbage Collection and Heap Memory

Last week we had a brief look at the garbage collectors available in two Java VMs. We also mentioned the importance of choosing the right one for the task.

This week we’ll look at what is actually garbage, and the idea of generational garbage collection. We’ll also look at heap memory requirements.

What is Garbage?

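An object is considered to be garbage when it can no longer be reached through any chain of references from the running program, such as a local variable on a thread’s stack, a static field, or a field in another live object. When that happens, its memory can be freed and reused by the JVM.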

The simplest garbage collection strategy would be to iterate over every reachable object each time the collector runs. Any leftover objects would then be garbage. The more live objects we have, the longer this would take. This would be very time consuming for large applications with lots of live data.
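The simple strategy described above can be sketched as a graph traversal. This is only a toy model, not how a real JVM collector is implemented: the "heap" is a hypothetical map from object names to the names they reference, and the starting points (roots) are passed in by hand.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilityDemo {

    // Returns every object that cannot be reached from the roots.
    // "heap" maps each object name to the names of the objects it references.
    static Set<String> findGarbage(Map<String, List<String>> heap, Set<String> roots) {
        Set<String> reachable = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);

        // Visit every object that is reachable from the roots.
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (reachable.add(current)) {
                pending.addAll(heap.getOrDefault(current, List.of()));
            }
        }

        // Whatever was never visited is garbage.
        Set<String> garbage = new TreeSet<>(heap.keySet());
        garbage.removeAll(reachable);
        return garbage;
    }

    public static void main(String[] args) {
        Map<String, List<String>> heap = Map.of(
                "a", List.of("b"),
                "b", List.of(),
                "c", List.of("d"),   // c and d only reference each other
                "d", List.of("c"));
        System.out.println(findGarbage(heap, Set.of("a"))); // prints [c, d]
    }
}
```

Note that the work done by the loop depends on the number of reachable objects, which is why this approach slows down as the amount of live data grows.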

Generational Garbage Collection

We would obviously like to speed up this simple strategy. One of the easiest ways is to split the heap into a number of smaller memory areas based on the age of the objects they hold, and to run the garbage collector over each area separately. This technique is called generational collection.

Most applications have some common memory usage characteristics that we can use to minimise the work of a garbage collector. The most important observed characteristic is called the weak generational hypothesis. It states that most objects survive for only a short time. To reuse a common phrase, they “live fast and die young”. If we can put these objects in the same memory area, the garbage collector could potentially run much faster.

Generations

Heap memory is split into generations. A generation is a memory pool (a part of heap memory) that contains objects of a similar age. Garbage collection runs in each generation when that particular generation fills up.

Many garbage collectors have two generations: young and old. The young generation is further divided into three areas: Eden and two equal-sized survivor spaces. There is also a certain amount of virtual space in both the old and the young generations. This is uncommitted heap space which will be used as the heap grows.
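To see which heap areas a running JVM actually uses, we can ask the standard `java.lang.management` API. The sketch below is a minimal example; the class name `HeapPools` is made up for illustration, and the pool names that are printed depend on the JVM and on the garbage collector in use (with G1, for example, names such as "G1 Eden Space" and "G1 Old Gen" typically appear).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPools {

    // Collects the names of all memory pools that belong to the heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the metaspace
            }
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            long committed = (usage == null) ? 0 : usage.getCommitted();
            System.out.println(pool.getName() + ": " + committed / 1024 + " KiB committed");
        }
    }
}
```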

Most objects are initially allocated in Eden (as in the biblical first garden of Eden). This is the pool within the young generation that is dedicated to newly created objects. Most young objects also die in Eden.

When the young generation fills up, it triggers a minor collection where only the young generation is collected. The garbage in other generations isn’t collected. The cost of a collection is roughly proportional to the number of live objects that have to be processed, not to the amount of garbage. A young generation full of dead objects can therefore be reclaimed very quickly, so the more short-lived objects that die in the young generation, the better. A minor collection is fast, but it isn’t complete: a dead young object that is still referenced from a dead object in the old generation stays in memory until the old generation is itself collected.
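We can watch collections happen by creating a lot of short-lived objects and reading the collection counters that the JVM exposes through `GarbageCollectorMXBean`. This is a rough sketch only: the class name, the array size and the loop count are arbitrary choices, and the number and names of the collectors reported vary between JVMs and garbage collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class CollectionCounts {

    // Keeps each new array briefly reachable so the allocation is not optimised away.
    static volatile byte[] sink;

    // Adds up the collection counts reported by all collectors in this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Allocates many small arrays; each one becomes garbage as soon as the next is created.
    static void allocateShortLivedObjects(int count) {
        for (int i = 0; i < count; i++) {
            sink = new byte[1024];
        }
    }

    public static void main(String[] args) {
        long before = totalCollections();
        allocateShortLivedObjects(2_000_000);
        long after = totalCollections();

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
        System.out.println("Collections during allocation: " + (after - before));
    }
}
```

On a typical setup most or all of the collections counted here would be expected to come from the young generation collector, because almost nothing survives long enough to be promoted.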

What happens to live objects that remain in Eden after a minor collection? These are copied to one of the survivor spaces.

One survivor space is empty at any time. During garbage collection, live objects in Eden and the other survivor space are copied to this empty survivor space. After collection, Eden and the source survivor space are emptied. In the next garbage collection, the two survivor spaces swap around. The most recently filled survivor space now becomes the source of live objects to be copied into the empty destination survivor space.

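Objects are copied between survivor spaces in this way until they’ve been copied a certain number of times, or there isn’t enough memory left in the destination space. Each copy increases an object’s age, which is why this process is called aging. Objects that are old enough, or that no longer fit in the destination space, are copied into the old generation. This step is known as promotion or tenuring.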
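The journey of a single long-lived object can be modelled in a few lines. This is a simplified simulation, not JVM code: the threshold of 3 copies is an assumed value chosen for the example, the space names are just labels, and the case where the destination survivor space runs out of room is ignored.

```java
class AgingDemo {

    // Assumed for illustration: copies allowed before promotion to the old generation.
    static final int TENURING_THRESHOLD = 3;

    // Returns where an object that stays live is found after the given number
    // of minor collections.
    static String locationAfter(int minorCollections) {
        String location = "Eden";
        int age = 0;
        for (int gc = 1; gc <= minorCollections; gc++) {
            if (location.equals("Old generation")) {
                break; // minor collections do not move objects in the old generation
            }
            if (age >= TENURING_THRESHOLD) {
                location = "Old generation"; // old enough: promote
            } else {
                // The empty (destination) survivor space alternates on every collection.
                location = (gc % 2 == 1) ? "Survivor 0" : "Survivor 1";
                age++;
            }
        }
        return location;
    }

    public static void main(String[] args) {
        for (int gc = 0; gc <= 5; gc++) {
            System.out.println("After " + gc + " minor collections: " + locationAfter(gc));
        }
    }
}
```

With this threshold the object moves from Eden to a survivor space on the first collection, swaps survivor spaces on the second and third, and is promoted to the old generation on the fourth.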

A fraction of the surviving objects from the young generation are moved to the old generation during each minor collection. Eventually the old generation fills up and must be collected. This causes a major garbage collection where the entire heap is collected. Major collections usually take much longer to run than minor collections because a larger number of objects are involved.

Default Heap Sizes

There are two important factors affecting garbage collection performance. These are total available memory and the proportion of the heap allocated to the young generation.

The default maximum heap size is calculated by the JVM on startup. From Java 11, all the garbage collectors use the same calculation as the parallel collector. The client JVM does a similar calculation that gives a smaller maximum heap size than the server JVM.

Unless the initial and maximum heap sizes are specified on the command line, they’re calculated based on the amount of physical memory on the machine. The following are the default sizes in Java 11:

  • The maximum heap size is set to one quarter of the physical memory.
  • The initial heap size is set to 1/64th of the physical memory.
  • The maximum young generation size is set to one third of the total heap size.
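As a worked example, the three rules above can be applied to a machine with an assumed 16 GiB of physical memory. The helper methods below are hypothetical and only do the arithmetic from the list; a real JVM may apply further limits and rounding. The last line prints the maximum heap the current JVM actually reports, using the standard `Runtime.maxMemory()` method, so we can compare it with our own machine.

```java
class DefaultHeapSizes {

    static final long MIB = 1024L * 1024L;
    static final long GIB = 1024L * MIB;

    // Default maximum heap: one quarter of physical memory.
    static long defaultMaxHeap(long physicalBytes) {
        return physicalBytes / 4;
    }

    // Default initial heap: 1/64th of physical memory.
    static long defaultInitialHeap(long physicalBytes) {
        return physicalBytes / 64;
    }

    // Maximum young generation: one third of the total heap.
    static long maxYoungGeneration(long heapBytes) {
        return heapBytes / 3;
    }

    public static void main(String[] args) {
        long physical = 16 * GIB; // assumed example machine
        long maxHeap = defaultMaxHeap(physical);

        System.out.println("Maximum heap (MiB): " + maxHeap / MIB);
        System.out.println("Initial heap (MiB): " + defaultInitialHeap(physical) / MIB);
        System.out.println("Maximum young generation (MiB): " + maxYoungGeneration(maxHeap) / MIB);

        // What the JVM running this program reports as its maximum heap.
        System.out.println("This JVM's maximum heap (MiB): " + Runtime.getRuntime().maxMemory() / MIB);
    }
}
```

For the 16 GiB example this gives a maximum heap of 4096 MiB, an initial heap of 256 MiB, and a maximum young generation of about 1365 MiB.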

There is no single correct way to choose the size of the heap and the individual generations. The optimal sizes are determined by how the application uses memory, as well as pause time and throughput requirements. This will also determine the choice of garbage collector. The JVM’s default choice of a GC isn’t always optimal. Based on our requirements, we can choose memory sizes and garbage collectors with command-line options.

Next week we’ll look in more detail at some of the command-line options to select memory sizes and ratios.

Until then, stay safe and keep on learning!
