JVM ARCHITECTURE

Deepti Swain
Published in InterviewNoodle
7 min read · Oct 17, 2021

As programmers, it is important that we understand the architecture of the language platform we use, because it helps us write more efficient code. To run applications in many programming languages, an application-based (process-based) virtual machine acts as the runtime engine: it loads the bytecode, interprets it, and executes it. For example, the JVM acts as the runtime engine for Java-based applications.

In this article, we will learn more deeply about the JVM architecture in Java and the different components of the JVM.

What is JVM?

Java was developed around the concept of WORA (Write Once, Run Anywhere). A programmer can develop Java code on one system and expect it to run on any other Java-enabled system because of the Java Virtual Machine (JVM). The JVM is an application-based VM that is part of the Java Runtime Environment (JRE) and is responsible for loading and running Java class files. The JVM is also what calls the main method of a Java program.

Architectural Diagram of JVM:

JVM Architecture Diagram ~by Deepti Swain

As shown in the above architecture diagram, the JVM is divided into three main subsystems:

  1. ClassLoader Subsystem
  2. Memory Area Subsystem
  3. Execution Engine Subsystem

1. ClassLoader Subsystem:
It is responsible for three activities: loading, linking, and initialization.
i) Loading:

ClassLoader Subsystem: I) Loading ~by Deepti Swain

In the loading phase, the class loader reads each .class file and stores the corresponding binary data in the method area. The following data is stored in the method area:
a) The fully qualified name of the loaded class and of its immediate parent class.
b) Information from the .class file, such as whether it describes a class, an interface, or an enum, plus method, variable, constructor, modifier, and constant pool information.
- After loading the .class file, the JVM immediately creates an object of type java.lang.Class on the heap to represent the loaded class.
- Programmers can use this “Class object” to get class-level information such as method, variable, and constructor information. For every loaded type, only one Class object is created, even if the class is used many times in the program, as shown in the image below.

Class Object Info ~by Deepti Swain
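The idea of a single Class object per loaded type can be checked with a few lines of Java. The following is only a minimal sketch: the names ClassObjectDemo and Student are made up for illustration and do not come from the diagrams above.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ClassObjectDemo {
    // Hypothetical class used only for this illustration.
    static class Student {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Two instances of the same type share one java.lang.Class object.
    static boolean sameClassObject() {
        Student s1 = new Student();
        Student s2 = new Student();
        return s1.getClass() == s2.getClass() && s1.getClass() == Student.class;
    }

    // Reads method names from the Class object via reflection.
    static List<String> methodNames() {
        List<String> names = new ArrayList<>();
        for (Method m : Student.class.getDeclaredMethods()) {
            names.add(m.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sameClassObject()); // true
        System.out.println(methodNames());
    }
}
```

The comparison uses == on purpose: it checks that both instances point to the very same Class object, not just to equal ones.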

Types of class loaders:
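The ClassLoader subsystem contains 3 types of class loaders, and the JVM follows the “Delegation-Hierarchy principle” to load classes: a request to load a class is first delegated to the parent loader, and a loader tries to load the class itself only if its parent cannot find it.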
1. Bootstrap class loader / Primordial class loader
2. Extension class loader
3. Application class loader / System class loader

1. Bootstrap class loader: This loader is responsible for loading the core Java API classes (such as String and StringBuffer), i.e. the classes present in rt.jar. The path of rt.jar is jdk/jre/lib/rt.jar, and this location (jdk/jre/lib/) is called the bootstrap classpath. This layout applies to Java 8 and earlier; since Java 9 the core classes are packaged as modules rather than in rt.jar. The bootstrap class loader is available by default in every JVM and is implemented in native code rather than in Java.
2. Extension class loader: This is the child of the bootstrap class loader in the delegation hierarchy (a parent-child delegation relationship, not class inheritance). It is responsible for loading classes from the extension classpath (jdk/jre/lib/ext/*.jar). The extension class loader is implemented in Java; in Java 8 and earlier the corresponding class is sun.misc.Launcher$ExtClassLoader.
3. Application class loader / System class loader: This is the child of the extension class loader in the delegation hierarchy. It is responsible for loading classes from the application classpath, and it internally uses the classpath environment variable (CLASSPATH). The application class loader is implemented in Java; in Java 8 and earlier the corresponding class is sun.misc.Launcher$AppClassLoader.
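The delegation hierarchy can be inspected at run time by following getParent() from a class's loader. This is a small sketch with a made-up class name (ClassLoaderDemo). The loader class names it prints depend on the Java version, so none are shown here. The bootstrap loader is implemented in native code, so Java code sees it as null.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassLoaderDemo {
    // Walks from the given loader up through its parents.
    // The bootstrap loader has no Java object, so the walk ends at null.
    static List<String> delegationChain(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        chain.add("bootstrap (seen as null from Java code)");
        return chain;
    }

    public static void main(String[] args) {
        // Core API classes are loaded by the bootstrap loader.
        System.out.println(String.class.getClassLoader()); // null

        // Application classes are loaded further down the hierarchy.
        for (String name : delegationChain(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
    }
}
```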

ii) Linking:
Linking consists of 3 activities: Verify, Prepare, and Resolve.
a) Verify/Verification: This is the process of ensuring that the binary representation of a class is structurally correct. Inside the JVM, the bytecode verifier checks whether the .class file was generated by a valid compiler and whether it is properly formatted. This is one of the reasons Java is considered a secure language. If verification fails, the JVM throws java.lang.VerifyError at run time (an Error, not an Exception).
b) Prepare/Preparation: In this phase, the JVM allocates memory for class-level static variables and assigns them default values (not yet the values written in the code).
c) Resolve/Resolution: This is the process of replacing the symbolic references (names) used in our program with direct references to the corresponding data in the method area.

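class Student {}  // minimal placeholder (assumed here) so that the example compiles

class Test {
    public static void main(String[] args) {
        String s = new String("Adwet");  // symbolic reference to java.lang.String
        Student s1 = new Student();      // symbolic reference to Student
    }
}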

For the above class, the class loader loads Test.class, Object.class, String.class, Student.class, etc. into the method area. The names of the classes and members that the program refers to (such as String and Student) are stored as symbolic references in the constant pool of the Test class. In the resolution phase, these names are replaced with direct references into the method area.

iii) Initialization:
In this phase, all static variables are assigned their original values (the values written in the code), and static blocks are executed from parent class to child class and from top to bottom within each class.
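The ordering described above can be observed with a small experiment. This is a sketch with made-up names (InitOrderDemo, Parent, Child). The first static block in Child reads Child.b before its assignment has run, so it sees the default value set during preparation.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrderDemo {
    // Records initialization events in the order they happen.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int a = 10;
        static { LOG.add("Parent block sees a=" + a); }
    }

    static class Child extends Parent {
        // The qualified name Child.b is needed to read b before its declaration.
        static { LOG.add("Child block 1 sees b=" + Child.b); }
        static int b = 42;
        static { LOG.add("Child block 2 sees b=" + b); }
    }

    static List<String> run() {
        // Reading a non-constant static field triggers initialization of Child,
        // which first initializes its superclass Parent.
        int value = Child.b;
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
        // Parent block sees a=10
        // Child block 1 sees b=0
        // Child block 2 sees b=42
    }
}
```

A class is initialized only once, so calling run() again returns the same three entries rather than appending new ones.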

Note: If any error occurs during loading, linking, or initialization, the JVM reports it at run time, typically as java.lang.LinkageError or one of its subclasses.
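Loading and initialization are separate steps, and the standard three-argument Class.forName makes that visible: its boolean argument controls whether the class is initialized after loading. The sketch below uses made-up names (LazyInitDemo, Heavy) and assumes the demo class was loaded by an ordinary application class loader.

```java
final class LazyInitDemo {
    // Flipped to true by Heavy's static block, i.e. only when Heavy is initialized.
    static boolean heavyInitialized = false;

    static class Heavy {
        static { LazyInitDemo.heavyInitialized = true; }
    }

    // Returns {flag after loading only, flag after initialization}.
    static boolean[] run() {
        ClassLoader loader = LazyInitDemo.class.getClassLoader();
        String name = LazyInitDemo.class.getName() + "$Heavy";
        try {
            Class.forName(name, false, loader); // load, but do not initialize
            boolean afterLoad = heavyInitialized;
            Class.forName(name, true, loader);  // now run static initializers
            boolean afterInit = heavyInitialized;
            return new boolean[] { afterLoad, afterInit };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println(result[0] + " " + result[1]); // false true
    }
}
```

The "false true" result holds for the first call only, since a class is initialized once per class loader.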

2. Memory Area Subsystem:

Whenever the JVM loads and runs a Java program, it needs memory to store several things, such as bytecode, objects, and variables. The total JVM memory is organized into the following 5 categories.

i. Method area: Every JVM has one method area, which is created at JVM startup. It stores class-level binary data, including static variables, as well as the constant pool of each class. The method area is shared by all threads and can be accessed by several of them simultaneously, so the data in it is not thread-safe. The memory of the method area need not be contiguous.

ii. Heap area: Every JVM has one heap area, which is created at JVM startup. Objects and their instance variables are stored in the heap. The heap can be accessed by multiple threads, so the data stored in it is not thread-safe. The memory of the heap need not be contiguous.

iii. Stack area: For every thread, the JVM creates a separate stack at the time of thread creation. Every method call performed by that thread is stored on the stack, together with its local variables. When a method completes, the corresponding entry is removed, and once all method calls have completed the stack is empty. The JVM destroys the empty stack just before terminating the thread. Each entry in the stack is called a stack frame or activation record. The data stored in a stack is available only to the owning thread and not to other threads, so this data is thread-safe. Each stack frame contains a local variable array, an operand stack, and frame data.

iv. PC registers: For every thread, a separate PC (program counter) register is created at the time of thread creation. The PC register contains the address of the currently executing instruction. Once that instruction completes, the PC register is automatically updated to hold the address of the next instruction.

v. Native method stacks: For every thread, the JVM creates a separate native method stack. All native method calls invoked by the thread are stored in the corresponding native method stack.
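Two of the memory areas above can be observed from plain Java code. The sketch below uses a made-up class name (MemoryAreasDemo). It counts stack frames through Thread.getStackTrace() and reads heap and non-heap usage through the standard java.lang.management API. According to the MemoryMXBean documentation, the method area belongs to non-heap memory. The byte counts vary between runs, so they are not shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

final class MemoryAreasDemo {
    // Recurses extraCalls times, then reports how many frames
    // are currently on the calling thread's stack.
    static int framesAfter(int extraCalls) {
        if (extraCalls == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return framesAfter(extraCalls - 1);
    }

    static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        // Each nested call pushes one frame; returning pops it again.
        System.out.println(framesAfter(3) - framesAfter(0)); // 3

        // A new thread gets its own stack, starting from its own run().
        int[] workerFrames = new int[1];
        Thread worker = new Thread(() -> { workerFrames[0] = framesAfter(0); });
        worker.start();
        worker.join();
        System.out.println("frames on worker stack: " + workerFrames[0]);

        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("non-heap used (bytes): " + nonHeapUsedBytes());
    }
}
```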

3. Execution Engine Subsystem:

This is the central component of the JVM and is responsible for executing Java class files. It mainly contains three components:

i. Interpreter: This is responsible for reading the bytecode, interpreting it into machine code (native code), and executing that machine code line by line. The drawback of the interpreter is that it interprets the code every time, even if the same method is invoked multiple times, which reduces the performance of the system.

ii. JIT (Just In Time) Compiler: The main purpose of the JIT compiler is to improve performance. Internally, the JIT compiler maintains a separate count for every method. Whenever the JVM comes across a method call, the method is first interpreted normally by the interpreter, and the JIT compiler increments the corresponding count. This continues for every method. Once a method's count reaches the threshold value, the JIT compiler identifies it as a repeatedly used method (a hot spot), compiles it immediately, and generates the corresponding native code. The next time the JVM comes across that method call, it executes the native code directly instead of interpreting the method again, which improves performance. The threshold count varies from JVM to JVM. JIT compilation applies only to repeatedly used methods.

iii. Garbage Collector: The GC finds unreferenced objects and removes them, reclaiming their memory.
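The JIT behaviour described above can be provoked by calling a small method many times. The sketch below uses made-up names (JitDemo, sumOfSquares) and the standard CompilationMXBean to read the total time the JVM has spent compiling. Whether and when the method is compiled depends on the JVM and its threshold, so the timing output is only indicative and the helper returns -1 if the JVM does not report it.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

final class JitDemo {
    // A small method that becomes "hot" when called repeatedly.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Invokes the hot method many times so its invocation count grows.
    static long callRepeatedly(int calls) {
        long checksum = 0;
        for (int i = 0; i < calls; i++) {
            checksum += sumOfSquares(100);
        }
        return checksum;
    }

    // Total JIT compilation time in milliseconds, or -1 if unavailable.
    static long compilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        long before = compilationMillis();
        long checksum = callRepeatedly(200_000);
        long after = compilationMillis();
        System.out.println("checksum: " + checksum);
        System.out.println("JIT time before/after (ms): " + before + " / " + after);
    }
}
```

On HotSpot-based JVMs, running the program with the -XX:+PrintCompilation option lists methods as they are compiled, which is a more direct way to watch a hot method being picked up.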

Java Native Interface (JNI): This acts as a mediator between Java method calls and the corresponding native libraries, i.e. JNI is responsible for providing information about native libraries to the JVM.

Native method libraries: This holds the native libraries (for example, libraries written in C or C++) that JNI makes available to the JVM.

In this article, we covered what the JVM is, followed by a deep dive into the JVM architecture, explaining how each of its 3 subsystems (the ClassLoader Subsystem, the Memory Area Subsystem, and the Execution Engine Subsystem) works in Java.

Did you find the content listed in this article helpful? Let me know your thoughts and any other topic you want me to cover in the comment section below. If you enjoyed this article, share it with your friends and colleagues! Happy Learning :)
