Garbage Collection in Java - Part 1
What is GC and How it Works in the JVM
In my previous article, I wrote about the Java Virtual Machine (JVM) and explained its architecture. As part of the Execution Engine component, I also briefly covered the Java Garbage Collector (GC). In this article, you will learn more about the Garbage Collector and how it works.
What is Garbage Collection in Java?
Garbage Collection is the process of reclaiming unused runtime memory by destroying objects that are no longer in use.
In languages like C and C++, the programmer is responsible for both the creation and destruction of objects. Sometimes, the programmer may forget to destroy useless objects, and the memory allocated to them is not released. The used memory of the system keeps on growing and eventually there is no memory left in the system to allocate. Such applications suffer from "memory leaks".
After a certain point, there is not enough memory available to create new objects, and the program terminates abnormally (in Java, this failure surfaces as an OutOfMemoryError).
In C, you release memory manually with free(), and in C++ with the delete operator. In Java, garbage collection happens automatically during the lifetime of a program. This eliminates the need to de-allocate memory manually and avoids this class of memory leaks, although objects that stay referenced by mistake are still never collected.
Java Garbage Collection is the process by which Java programs perform automatic memory management. Java programs compile into bytecode that can be run on a Java Virtual Machine (JVM).
When Java programs run on the JVM, objects are created on the heap, which is a portion of memory dedicated to the program.
Over the lifetime of a Java application, new objects are created and released. Eventually, some objects are no longer needed. You can say that at any point in time, the heap memory consists of two types of objects:
- Live - these objects are being used and referenced from somewhere else
- Dead - these objects are no longer used or referenced from anywhere
The garbage collector finds these unused objects and deletes them to free up memory.
How to Dereference an Object in Java
The main objective of Garbage Collection is to free heap memory by destroying the objects that are no longer referenced. When there are no references to an object, it is assumed to be dead and no longer needed. So the memory occupied by the object can be reclaimed.
There are various ways in which the references to an object can be released to make it a candidate for Garbage Collection. Some of them are:
By making a reference null
Student student = new Student();
student = null;
By assigning a reference to another
Student studentOne = new Student();
Student studentTwo = new Student();
studentOne = studentTwo; // the object first referred to by studentOne is now available for garbage collection
By using an anonymous object
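Here is a minimal sketch that puts the three cases side by side. The Student class, the nameLength helper, and the clearedAfterGc helper are made up for illustration. The sketch uses WeakReference to observe whether an object has been reclaimed; keep in mind that System.gc() is only a request, so the printed results can differ between JVMs and runs.

```java
import java.lang.ref.WeakReference;

class DereferenceDemo {
    // Hypothetical class used only for this illustration.
    static class Student {
        final String name;
        Student(String name) { this.name = name; }
    }

    // Hypothetical helper: uses its argument only for the duration of the call.
    static int nameLength(Student s) {
        return s.name.length();
    }

    // Requests a collection a few times and reports whether the referent was cleared.
    // System.gc() is only a hint, so "false" does not prove the object is still in use.
    static boolean clearedAfterGc(WeakReference<?> ref) {
        for (int i = 0; i < 5 && ref.get() != null; i++) {
            System.gc();
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        // 1. Making a reference null
        Student student = new Student("Ann");
        WeakReference<Student> nulled = new WeakReference<>(student);
        student = null;

        // 2. Assigning one reference to another
        Student studentOne = new Student("Bob");
        Student studentTwo = new Student("Cid");
        WeakReference<Student> replaced = new WeakReference<>(studentOne);
        studentOne = studentTwo; // the "Bob" object is no longer referenced

        // 3. Using an anonymous object: no variable ever holds the new Student
        System.out.println("name length: " + nameLength(new Student("Dee")));

        System.out.println("nulled object cleared:   " + clearedAfterGc(nulled));
        System.out.println("replaced object cleared: " + clearedAfterGc(replaced));
        System.out.println("studentOne now refers to: " + studentOne.name);
    }
}
```

A weak reference does not keep its referent alive, which is why it can be used here to watch an object disappear without preventing its collection.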
How does Garbage Collection Work in Java?
Java garbage collection is an automatic process. The programmer does not need to explicitly mark objects to be deleted.
The garbage collection implementation lives in the JVM. Each JVM can implement its own version of garbage collection, because the JVM specification does not mandate a particular algorithm. In practice, an implementation works with the objects present in the heap memory, marks or identifies the unreachable objects, and reclaims them, often with compaction.
What are Garbage Collection Roots in Java?
Garbage collectors work on the concept of Garbage Collection Roots (GC Roots) to identify live and dead objects.
Examples of such Garbage Collection roots are:
- Classes loaded by system class loader (not custom class loaders)
- Live threads
- Local variables and parameters of the currently executing methods
- Local variables and parameters of JNI methods
- Global JNI references
- Objects used as a monitor for synchronization
- Objects held from garbage collection by the JVM for its own purposes
The garbage collector traverses the whole object graph in memory, starting from those Garbage Collection Roots and following references from the roots to other objects.
Phases of Garbage Collection in Java
A standard Garbage Collection implementation involves three phases:
Mark objects as alive
In this step, the GC identifies all the live objects in memory by traversing the object graph.
When the GC visits an object, it marks it as reachable and thus alive. All the objects that are not reachable from the GC Roots are garbage and are considered candidates for garbage collection.
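The marking step can be sketched as a plain graph traversal. This is a toy model, not how any real JVM is implemented: the Node class and the mark method are hypothetical, and the "roots" are simply a list passed in by the caller.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class MarkPhaseDemo {
    // Hypothetical stand-in for a heap object that holds references to other objects.
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns every node reachable from the roots; anything outside the set is garbage.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {        // first visit: mark as alive
                pending.addAll(current.refs); // then follow its outgoing references
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        b.refs.add(a); // cycle between two live objects
        c.refs.add(d);
        d.refs.add(c); // cycle that no root can reach

        Set<Node> alive = mark(List.of(a)); // "a" plays the role of the only GC Root
        for (Node n : List.of(a, b, c, d)) {
            System.out.println(n.name + " -> " + (alive.contains(n) ? "live" : "garbage"));
        }
    }
}
```

Running it reports a and b as live and c and d as garbage. Note that c and d still reference each other: reachability from the roots decides liveness, not whether some reference to the object exists.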
Sweep dead objects
After the marking phase, the memory space is occupied by live (visited) and dead (unvisited) objects. The sweep phase releases the memory fragments that contain the dead objects.
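As a rough sketch of the sweep step, assume the heap is an array of slots and that the mark phase has already set a mark bit on every live slot. The Slot class and the helper methods below are hypothetical and exist only for this illustration.

```java
class SweepPhaseDemo {
    // Hypothetical heap slot: a payload plus the mark bit set by the mark phase.
    static class Slot {
        final String payload;
        boolean marked;
        Slot(String payload, boolean marked) {
            this.payload = payload;
            this.marked = marked;
        }
    }

    // Frees every unmarked slot, resets the mark bit of survivors, and returns the number freed.
    static int sweep(Slot[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            Slot slot = heap[i];
            if (slot == null) {
                continue;            // already free
            }
            if (slot.marked) {
                slot.marked = false; // survivor: reset the bit for the next cycle
            } else {
                heap[i] = null;      // dead: release the slot
                freed++;
            }
        }
        return freed;
    }

    // Renders the heap as a string, using "_" for free slots.
    static String layout(Slot[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Slot s : heap) {
            sb.append(s == null ? "_" : s.payload);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Slot[] heap = {
            new Slot("A", true), new Slot("B", false), new Slot("C", true),
            new Slot("D", false), new Slot("E", true)
        };
        System.out.println("before: " + layout(heap));
        int freed = sweep(heap);
        System.out.println("after:  " + layout(heap) + " (freed " + freed + ")");
    }
}
```

The layout goes from ABCDE to A_C_E: two slots are free again, but they are not adjacent.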
Compact remaining objects in memory
The dead objects that were removed during the sweep phase may not necessarily be next to each other. Thus, you can end up having fragmented memory space.
Memory can be compacted after the garbage collector deletes the dead objects, so that the remaining objects are in a contiguous block at the start of the heap.
The compaction process makes it easier to allocate memory to new objects sequentially.
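Compaction can be sketched in the same toy style, assuming the heap is an array in which null means a free slot. The compact method is hypothetical. A real collector also has to update every reference that points to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

class CompactPhaseDemo {
    // Slides live entries to the front of the array, preserving their order.
    // Returns the index of the first free slot, where the next allocation can go.
    static int compact(String[] heap) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String live = heap[i];
                heap[i] = null;
                heap[next++] = live;
            }
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"A", null, "C", null, "E"}; // fragmented after a sweep
        int firstFree = compact(heap);
        System.out.println(Arrays.toString(heap) + ", first free slot = " + firstFree);
    }
}
```

After compaction the live entries A, C, and E sit at indices 0 to 2, and a new object can simply be placed at index 3 without searching for a gap that is large enough.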
What is Generational Garbage Collection in Java?
Most Java Garbage Collectors implement a generational garbage collection strategy that categorizes objects by age.
Having to mark and compact all the objects in a JVM is inefficient. As more and more objects are allocated, the list of objects grows, leading to longer garbage collection times. Empirical analysis of applications has shown that most objects in Java are short lived.
In a typical graph of object lifetimes, the X axis shows object lifetime (measured in bytes allocated) and the Y axis shows the number of bytes in objects with that lifetime. Fewer and fewer objects remain allocated over time.
In fact most objects have a very short life as shown by the higher values on the left side of the graph. This is why Java categorizes objects into generations and performs garbage collection accordingly.
The heap memory area in the JVM is traditionally divided into three sections: the Young Generation, the Old Generation, and the Permanent Generation.
Newly created objects start in the Young Generation. The Young Generation is further subdivided into:
- Eden space - all new objects start here, and initial memory is allocated to them
- Survivor spaces (FromSpace and ToSpace) - objects are moved here from Eden after surviving one garbage collection cycle.
When objects are garbage collected from the Young Generation, it is a minor garbage collection event.
When Eden space is filled with objects, a Minor GC is performed. All the dead objects are deleted, and all the live objects are moved to one of the survivor spaces. Minor GC also checks the objects in a survivor space, and moves them to the other survivor space.
Take the following sequence as an example:
- Eden has all objects (live and dead)
- Minor GC occurs - all dead objects are removed from Eden. All live objects are moved to S1 (FromSpace). Eden and S2 are now empty.
- New objects are created and added to Eden. Some objects in Eden and S1 become dead.
- Minor GC occurs - all dead objects are removed from Eden and S1. All live objects are moved to S2 (ToSpace). Eden and S1 are now empty.
So, at any time, one of the survivor spaces is always empty. When the surviving objects reach a certain threshold of moving around the survivor spaces, they are moved to the Old Generation.
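The sequence above can be simulated with a few lists. This is only a sketch under simplifying assumptions: the YoungGenDemo class is hypothetical, liveness is supplied by the caller instead of being computed from GC Roots, the tenuring threshold of 2 is an arbitrary value chosen for the demo, and the two survivor lists swap the "from" and "to" roles after each cycle rather than keeping fixed names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class YoungGenDemo {
    static final int TENURING_THRESHOLD = 2; // assumed value, for illustration only

    static class Obj {
        final String name;
        int age; // number of minor GCs survived so far
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space that currently holds objects
    List<Obj> to = new ArrayList<>();   // survivor space that is currently empty
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o); // all new objects start in Eden
        return o;
    }

    // Copies live objects out of Eden and "from" into "to" (or into the Old
    // Generation once they are old enough), then swaps the survivor roles.
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!isLive.test(o)) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age > TENURING_THRESHOLD) {
                old.add(o);
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp; // "to" is empty again, ready for the next cycle
    }

    public static void main(String[] args) {
        YoungGenDemo heap = new YoungGenDemo();
        heap.allocate("a");
        heap.allocate("b");
        heap.minorGc(o -> o.name.equals("a")); // "b" is dead
        heap.allocate("c");
        heap.minorGc(o -> true);
        heap.minorGc(o -> true);
        System.out.println("eden=" + heap.eden.size() + " from=" + heap.from.size()
                + " to=" + heap.to.size() + " old=" + heap.old.size());
    }
}
```

After the three collections, object a has survived three cycles and sits in the Old Generation, object c is still in a survivor space, and both Eden and the other survivor space are empty.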
You can use the -Xmn flag to set the size of the Young Generation.
Objects that are long-lived are eventually moved from the Young Generation to the Old Generation. This is also known as Tenured Generation, and contains objects that have remained in the survivor spaces for a long time.
There is a threshold defined for the tenure of an object which decides how many garbage collection cycles it can survive before it is moved to the Old Generation.
When objects are garbage collected from the Old Generation, it is a major garbage collection event.
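If you want to see these events on a running JVM, the standard java.lang.management API exposes one bean per collector. The sketch below is one possible way to read the counters; the GcEventCounter class and its totalCollections helper are made up for this example. Collector names depend on the JVM and the selected collector (with G1, for example, they are "G1 Young Generation" and "G1 Old Generation"), and the allocation loop is only an attempt to produce garbage, since the JIT compiler may optimize it away.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcEventCounter {
    // Sums the collection counts of all collectors.
    // getCollectionCount() returns -1 when the count is undefined, so such values are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Try to create short-lived garbage so that the Young Generation fills up.
        for (int i = 0; i < 200_000; i++) {
            byte[] garbage = new byte[1024];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
        System.out.println("total collections: " + totalCollections());
    }
}
```

On a generational collector you would typically see the young collector's count grow much faster than the old collector's count.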
You can use the -Xms and -Xmx flags to set the initial and maximum size of the Heap memory.
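To check what heap size the JVM actually ended up with, you can query the Runtime class. This is a small sketch; the HeapSizeDemo class and the toMiB helper are hypothetical, and the reported maximum only roughly matches the -Xmx value because some collectors exclude part of the heap from it.

```java
class HeapSizeDemo {
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Example launch (the sizes are arbitrary): java -Xms256m -Xmx512m HeapSizeDemo
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MiB):          " + toMiB(rt.maxMemory()));
        System.out.println("committed heap (MiB):    " + toMiB(rt.totalMemory()));
        System.out.println("free in committed (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

The committed size starts near the initial size and can grow toward the maximum as the application allocates more objects.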
Since Java uses generational garbage collection, the more garbage collection events an object survives, the further it gets promoted in the heap. It starts in the young generation and eventually ends up in the tenured generation if it survives long enough.
Consider the following example to understand the promotion of objects between spaces and generations:
- When an object is created, it is first put into the Eden space of the young generation.
- Once a minor garbage collection happens, the live objects from Eden are promoted to the FromSpace.
- When the next minor garbage collection happens, the live objects from both Eden and FromSpace are moved to the ToSpace.
This cycle continues for a specific number of times. If the object is still used after this point, the next garbage collection cycle will move it to the old generation space.
Metadata such as classes and methods are stored in the Permanent Generation. It is populated by the JVM at runtime based on classes in use by the application. Classes that are no longer in use may be garbage collected from the Permanent Generation.
You can use the -XX:PermSize and -XX:MaxPermSize flags to set the initial and maximum size of the Permanent Generation.
Starting with Java 8, the Metaspace memory space replaces the PermGen space. Unlike PermGen, Metaspace is allocated from native memory rather than from the Java heap, and by default it is resized automatically.
This avoids the problem of applications running out of memory due to the limited size of the PermGen space. The Metaspace memory can be garbage collected, and the classes that are no longer used can be cleaned up automatically when Metaspace usage reaches a threshold, such as the limit set with -XX:MaxMetaspaceSize.
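You can list the memory areas of your own JVM with the standard MemoryPoolMXBean API. The sketch below groups pool names by type; the MemoryPoolsDemo class and the poolNames helper are made up for this example, and the exact pool names vary by JVM version and collector. On a Java 8 or later HotSpot JVM, Metaspace should show up among the non-heap pools.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {
    // Returns the names of all memory pools of the given type (HEAP or NON_HEAP).
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

The heap pools usually map onto the Eden, Survivor, and Old Generation areas described earlier.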
In this article, we discussed Java Garbage Collection and how it works. In the next article, we will cover more about the different types of garbage collectors available in Java and how they work.
Thank you for staying with me so far. Hope you liked the article. Happy reading. 🙂