Mini JVM Interpreter
“ვიცოდეთ ვისი გორისა ვართ”: these words in my native language mean “one should know one's origins”. As this good old Georgian saying suggests, it's always worth knowing the origins of the field you work in, and Java programming is no exception. That's why one day I decided to dig deeper into this topic to understand Java's “origins” better. Prerequisites for this article are: an advanced understanding of Java, a bachelor's degree in computer science and a minimum of 5 years of working experience as a Java engineer… :) :) Just kidding, come on. The only things you need for this article are a strong will and a full glass of red wine, so you can enjoy reading it as much as I enjoyed writing it.
JVM stands for Java Virtual Machine, the core engine that drives millions of Java programs all around the globe. The JVM makes it possible for Java-based software to follow the “write once, run anywhere” approach: you can run the same Java program on any machine that has a JVM installed. Think of the JVM as an intermediate layer of software between Java code and the machine. Before the JVM gets down to business, the Java compiler compiles .java files into .class files. The difference between them is that a .java file holds human-readable source code, while a .class file holds its bytecode representation. As you can see from the image below, the JVM starts working once it receives the bytecode of the classes, represented as .class files.
As I have already mentioned, when you compile a .java source file, it is converted into bytecode and saved as a .class file. When you use this class in your program, the class loader loads it into main memory. As you can see, there are three main parts in the JVM architecture: the Class Loader, responsible for loading classes; the Data Area, responsible for the program's memory allocation; and the Execution Engine, responsible for executing the bytecode received from the .class files. For now we will stop investigating the overall architecture of the JVM and start digging deeper into the first part of the Execution Engine: the Interpreter.
Last chance to get a glass of wine, this is highly recommended!!! :)
The Interpreter is a box that interprets bytecode. I know, this is the most complete explanation you have ever heard :)). Basically, it really is everything its name suggests. The Interpreter reads bytecode instructions one at a time and carries out each of them on the machine (this is the actual process of running your code). To implement our mini Interpreter, we first need to define the commands that our engine will interpret. This is the list of instructions that I defined for this system. The instructions are actually bytecodes that are simply represented as an enum here. Don't worry, we will discuss them in detail in just a few seconds.
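To make the idea of "bytecodes represented as an enum" concrete, here is a minimal sketch. The enum name MiniOpcode, the INTEGER_LOAD and INTEGER_MULTIPLY entries and the numeric values (borrowed from the real JVM's iconst_1, iload, istore, iadd, imul, iinc and ireturn) are assumptions made for illustration and may differ from the enum in the project.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical instruction set: the enum name and the numeric values are
// assumptions for illustration, not the project's actual definitions.
enum MiniOpcode {
    INTEGER_CONST_1(0x04),   // push the int constant 1
    INTEGER_LOAD(0x15),      // push a local variable onto the stack (assumed name)
    INTEGER_STORE(0x36),     // pop a value into a local variable
    INTEGER_ADD(0x60),       // pop two ints, push their sum
    INTEGER_MULTIPLY(0x68),  // pop two ints, push their product (assumed name)
    INTEGER_INCREMENT(0x84), // add a constant to a local variable
    INTEGER_RETURN(0xAC);    // pop the result and return it

    // Lookup table from the unsigned byte value to the enum constant.
    private static final Map<Integer, MiniOpcode> BY_CODE = new HashMap<>();

    static {
        for (MiniOpcode op : values()) {
            BY_CODE.put(op.code, op);
        }
    }

    private final int code;

    MiniOpcode(int code) {
        this.code = code;
    }

    // The value as it would appear in a byte[] of instructions.
    byte toByte() {
        return (byte) code;
    }

    // Java bytes are signed, so mask with 0xFF before the lookup.
    static MiniOpcode fromByte(byte b) {
        MiniOpcode op = BY_CODE.get(b & 0xFF);
        if (op == null) {
            throw new IllegalArgumentException("Unknown opcode: " + (b & 0xFF));
        }
        return op;
    }
}
```

The masking in fromByte matters because values such as 0xAC do not fit into a signed Java byte and come back negative when read from a byte array.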
Next, we need something that represents variables in Java. By variables I mean something that has a type and a value. For that we have JVMValue. This class has three constructor-like factory methods (entry()) that create JVMValue objects with the corresponding parameters.
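A value holder of this kind could be sketched as follows. Only the names JVMValue and entry() come from the text; the JVMType enum, the three supported types, the raw 64-bit storage and the accessor methods are assumptions chosen to keep the example small.

```java
// Assumed set of value types; the project's actual list is not shown here.
enum JVMType { INT, LONG, DOUBLE }

final class JVMValue {
    final JVMType type;
    final long bits; // raw storage shared by all three types (an assumption)

    private JVMValue(JVMType type, long bits) {
        this.type = type;
        this.bits = bits;
    }

    // Three constructor-like factory methods, one per supported type.
    static JVMValue entry(int value) {
        return new JVMValue(JVMType.INT, value);
    }

    static JVMValue entry(long value) {
        return new JVMValue(JVMType.LONG, value);
    }

    static JVMValue entry(double value) {
        return new JVMValue(JVMType.DOUBLE, Double.doubleToLongBits(value));
    }

    // Hypothetical accessors that check the type before converting back.
    int asInt() {
        if (type != JVMType.INT) {
            throw new IllegalStateException("Not an int: " + type);
        }
        return (int) bits;
    }

    double asDouble() {
        if (type != JVMType.DOUBLE) {
            throw new IllegalStateException("Not a double: " + type);
        }
        return Double.longBitsToDouble(bits);
    }
}
```

With this shape, JVMValue.entry(2) produces a value of type INT, while JVMValue.entry(2.5) produces a DOUBLE; the overload picked by the compiler decides the type tag.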
Next, we need a block of memory where we can keep all the values and variables that we encounter in the scope of this discussion. For the general memory of the system we have JVMStack, which is the same as a usual Stack with a pinch of extra features. JVMStack has 3 methods:
addIntegerConstant: pushes (adds) a new integer onto the stack.
multiplyIntegers: pops (gets) 2 values from the top of the stack, multiplies them and pushes (adds) the result back onto the stack.
addIntegers: does the same as multiplyIntegers, except that it sums the 2 values instead of multiplying them.
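The three methods listed above can be sketched on top of java.util.Stack. The method names follow the list; storing plain Integer values instead of JVMValue objects and the exact signatures are simplifying assumptions.

```java
import java.util.Stack;

// A sketch of the operand stack: a regular Stack plus three helpers.
// Holding Integer instead of JVMValue is a simplification for this example.
class JVMStack extends Stack<Integer> {

    // Pushes a constant onto the top of the stack.
    void addIntegerConstant(int value) {
        push(value);
    }

    // Pops the two topmost values and pushes their product.
    void multiplyIntegers() {
        int right = pop();
        int left = pop();
        push(left * right);
    }

    // Pops the two topmost values and pushes their sum.
    void addIntegers() {
        int right = pop();
        int left = pop();
        push(left + right);
    }
}
```

For example, pushing 2 and 3, calling multiplyIntegers(), pushing 1 and calling addIntegers() leaves a single value, 7, on the stack.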
And now, for all the Java fans around the world, this is the moment you have all been waiting for… At this point you are already into the topic and we can start testing our interpreter. Initially, let's test it on primitive arithmetic operations. Let's have a look at the sum test. I know that we haven't yet discussed the execMethod() function from line 22, but wait, let me first translate this block of code into ordinary language. In the buffer1 array of bytes we have 4 instructions that will be executed by our interpreter sequentially: INTEGER_CONST_1, INTEGER_CONST_1, INTEGER_ADD, INTEGER_RETURN.
These 4 instructions mean: get the integer value 1, get the integer value 1 again, sum them and return the result. In this case, by adding 1 and 1 we get 2, just as you can see on line 24. We also check the value type on line 23, but that's not the main thing here. Don't pay attention to new JVMLocalVars() at this point either. All you need to know is that the execMethod() function gets the above instructions and somehow executes code equivalent to return 1 + 1;
Just a quick reminder to refill your glass of wine :))
Now it's time to get into the mystery box and examine how the actual logic behind execMethod() works. Here it comes: JVMInterpreter is the class that is responsible for the actual execution of the instructions. The static block just initializes the array of JVMCommands, so don't pay much attention to it. The main stuff happens in execMethod itself. First we create a stack (memory) for keeping our values. Then we just iterate over the array of instructions. For now, just look at the marked switch cases. As you saw above, the first 2 instructions are INTEGER_CONST_1. When the switch reaches this case, addIntegerConstant(1) just pushes the int 1 onto the stack. Here it happens twice. Afterwards we have INTEGER_ADD and INTEGER_RETURN. From the marked switch cases it is very clear that if an instruction is equal to INTEGER_ADD, the sumIntegers() method is called, which sums the previously added integers and updates the stack. And finally, when the instruction is equal to INTEGER_RETURN, the result value is just popped from the stack and returned. I hope this is clear to you and we can continue with another example (in case all this was not clear to you, just get more alcohol and retry reading the paragraph).
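The loop described above can be sketched in a few lines. This is a simplified stand-in, not the project's JVMInterpreter: the class name MiniInterpreter, the plain ArrayDeque used as the stack, the int return type and the opcode values (borrowed from the real JVM's iconst_1, iadd, imul and ireturn) are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MiniInterpreter {
    // Assumed opcode values, taken from the real JVM instruction set.
    static final byte INTEGER_CONST_1 = 0x04;
    static final byte INTEGER_ADD = 0x60;
    static final byte INTEGER_MULTIPLY = 0x68;
    static final byte INTEGER_RETURN = (byte) 0xAC;

    // Executes the instructions one by one and returns the final result.
    static int execMethod(byte[] instructions) {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        int currentIndex = 0;
        while (currentIndex < instructions.length) {
            byte opcode = instructions[currentIndex++];
            switch (opcode) {
                case INTEGER_CONST_1:
                    stack.push(1);
                    break;
                case INTEGER_ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case INTEGER_MULTIPLY:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case INTEGER_RETURN:
                    return stack.pop();
                default:
                    throw new IllegalArgumentException(
                            "Unknown opcode: " + (opcode & 0xFF));
            }
        }
        throw new IllegalStateException("Method ended without INTEGER_RETURN");
    }

    public static void main(String[] args) {
        // Equivalent to: return 1 + 1;
        byte[] buffer1 = {
            INTEGER_CONST_1, INTEGER_CONST_1, INTEGER_ADD, INTEGER_RETURN
        };
        System.out.println(execMethod(buffer1)); // prints 2
    }
}
```

The whole interpreter is just a loop around a switch: every case either changes the stack or ends the method, which is why adding a new instruction usually means adding one more case.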
In the next example we will try to store a local variable with its value and, after some operations, load it as the result. The following test states: get int 5 -> store it in memory at the address SOKHUMI -> increment the variable stored at the address SOKHUMI by 2 -> get the value from the address SOKHUMI and return it. That's all. The code for this would look something like this: int a = 5; a += 2; return a;
Hmmm… now we have to store local variables somewhere. For that we are not going to use JVMStack; instead we will use the JVMLocalVars class. As you can see, when the INTEGER_STORE instruction is encountered we call the jvmLocalVars.store method to store the int value 5 at the address SOKHUMI. The address is represented here as the argument that follows INTEGER_STORE, so we read the next element of the instructions array by advancing currentIndex. To increment the value 5 by 2 we use the INTEGER_INCREMENT instruction, which calls the jvmLocalVars.incrementIntegerBy method with the corresponding parameters. To load the final result we just call the jvmLocalVars.load method and push the resulting value onto the stack, which makes the return value ready for the INTEGER_RETURN instruction. To see the implementation of JVMLocalVars, look at the second screenshot below. The implementation is pretty straightforward, so you can easily get into it.
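Here is a rough sketch of the store, increment and load flow. The method names store, incrementIntegerBy and load come from the text; their signatures, the fixed array of 256 slots, the class name LocalVarsSketch, the slot number chosen for SOKHUMI and the opcode values (borrowed from the real JVM's iconst_5, iload, istore, iinc and ireturn) are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Local variable storage: one int slot per one-byte address (an assumption).
final class JVMLocalVars {
    private final int[] slots = new int[256];

    void store(byte address, int value) { slots[address & 0xFF] = value; }

    int load(byte address) { return slots[address & 0xFF]; }

    void incrementIntegerBy(byte address, byte amount) {
        slots[address & 0xFF] += amount;
    }
}

final class LocalVarsSketch {
    // Assumed opcode values, taken from the real JVM instruction set.
    static final byte INTEGER_CONST_5 = 0x08;
    static final byte INTEGER_LOAD = 0x15;
    static final byte INTEGER_STORE = 0x36;
    static final byte INTEGER_INCREMENT = (byte) 0x84;
    static final byte INTEGER_RETURN = (byte) 0xAC;
    static final byte SOKHUMI = 1; // hypothetical slot number

    static int execMethod(byte[] instructions, JVMLocalVars jvmLocalVars) {
        Deque<Integer> stack = new ArrayDeque<>();
        int currentIndex = 0;
        while (currentIndex < instructions.length) {
            byte opcode = instructions[currentIndex++];
            switch (opcode) {
                case INTEGER_CONST_5:
                    stack.push(5);
                    break;
                case INTEGER_STORE:
                    // The byte after the opcode is the target address.
                    jvmLocalVars.store(instructions[currentIndex++], stack.pop());
                    break;
                case INTEGER_INCREMENT: {
                    // Two operands follow: the address, then the amount.
                    byte address = instructions[currentIndex++];
                    byte amount = instructions[currentIndex++];
                    jvmLocalVars.incrementIntegerBy(address, amount);
                    break;
                }
                case INTEGER_LOAD:
                    stack.push(jvmLocalVars.load(instructions[currentIndex++]));
                    break;
                case INTEGER_RETURN:
                    return stack.pop();
                default:
                    throw new IllegalArgumentException(
                            "Unknown opcode: " + (opcode & 0xFF));
            }
        }
        throw new IllegalStateException("Method ended without INTEGER_RETURN");
    }

    public static void main(String[] args) {
        // Equivalent to: int a = 5; a += 2; return a;
        byte[] buffer = {
            INTEGER_CONST_5,
            INTEGER_STORE, SOKHUMI,
            INTEGER_INCREMENT, SOKHUMI, 2,
            INTEGER_LOAD, SOKHUMI,
            INTEGER_RETURN
        };
        System.out.println(execMethod(buffer, new JVMLocalVars())); // prints 7
    }
}
```

Note that operands such as SOKHUMI and 2 live in the same byte array as the opcodes. They are never mistaken for instructions because the case that needs them consumes them by advancing currentIndex before the loop reads the next opcode.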
At this point, my dear friend, you should already be proud to understand one of the main principles of Java's core engine, the JVM. I hope you enjoyed this talk. Let me give you one last reminder: try to relax with a bit of red wine, and just enjoy coding.
For the whole project, you can visit this GitHub repo: