Java Virtual Machine





You might be curious as to how the Java Virtual Machine (JVM) actually works. Understanding the fine points of the JVM gives you a better grasp of the security structure of Java. This section walks through the main parts of the JVM.

The JVM is a specification that the Java language, compiler, and interpreter adhere to so that programs and runtime environments stay secure and portable. Its rules are strict enough that a developer can write an original interpreter from them, and that interpreter will run Java code on any machine where it is installed. The rules require the runtime interpreter to include all of the following pieces:

  • A set of bytecode instructions similar to that of a CPU, which contains opcodes and operands, and their values and alignments
  • A set of registers that tracks the state of the program at a given time
  • A Java stack, which stores information about the states of methods in stack frames
  • A garbage collection heap, which supplies the memory allocated to objects
  • Memory areas that store constants and methods

The Bytecode Instruction Set

When Java code is compiled, it is converted to bytecode, which plays a role similar to the assembly language produced by C and C++ compilers. Each instruction in the bytecode consists of an opcode followed by zero or more operands. The following list contains examples of opcodes and their descriptions:

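  • iload loads an integer from a local variable onto the operand stack
  • aload loads an object reference from a local variable onto the operand stack
  • ior computes the bitwise OR of two integers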
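To make the opcodes above concrete, here is a minimal sketch of a Java method together with the bytecode that javac typically emits for it. The class name OpcodeDemo is made up for illustration, and the exact listing can vary between compilers; running javap -c on the compiled class shows what your compiler actually produced.

```java
// Hypothetical demo class, not part of any library.
final class OpcodeDemo {
    // For this static method, javap -c typically prints:
    //   iload_0   // push the first int argument (local variable 0)
    //   iload_1   // push the second int argument (local variable 1)
    //   ior       // pop both values and push their bitwise OR
    //   ireturn   // return the int on top of the operand stack
    static int or(int a, int b) {
        return a | b;
    }

    public static void main(String[] args) {
        System.out.println(or(5, 3)); // prints 7 (binary 101 | 011 = 111)
    }
}
```

The iload_0 and iload_1 forms are one-byte shorthands for iload with an index operand of 0 or 1.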

Opcodes are represented by 8-bit numbers. Operands vary in length and are aligned to eight bits, so an operand larger than eight bits is split across multiple bytes. Java uses such small units to keep the bytecode compact. The Java team felt that compact code was worth a performance hit in the interpreter: because instructions vary in length, the interpreter cannot know where an instruction begins without decoding the ones before it. Compactness wins some of that performance back, because compact bytecode travels across networks more quickly than code padded out to larger, fixed instruction lengths. Code with fixed instruction lengths, on the other hand, is quicker to step through, because the interpreter knows the length and exact location of every instruction in advance.
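The cost of variable-length instructions can be sketched in a few lines. The class BytecodeWalker below is a hypothetical illustration, not an excerpt from any interpreter: it knows the operand sizes of only a handful of real opcodes, and it has to decode every instruction in order to learn where the next one starts. It also shows a 16-bit operand being reassembled from two bytes, high byte first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; covers only a few opcodes.
final class BytecodeWalker {
    // Number of operand bytes that follow each opcode handled here.
    static int operandBytes(int opcode) {
        switch (opcode) {
            case 0x10: return 1; // bipush <byte>
            case 0x11: return 2; // sipush <byte1> <byte2>
            case 0x15: return 1; // iload <index>
            case 0x1a: return 0; // iload_0
            case 0x80: return 0; // ior
            case 0xac: return 0; // ireturn
            default:
                throw new IllegalArgumentException("opcode not covered by this sketch: " + opcode);
        }
    }

    // Walks the code from the start and records where each instruction begins.
    static List<Integer> instructionOffsets(byte[] code) {
        List<Integer> offsets = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            offsets.add(pc);
            int opcode = code[pc] & 0xff;     // opcodes are unsigned 8-bit values
            pc += 1 + operandBytes(opcode);   // skip the opcode and its operands
        }
        return offsets;
    }

    // Reassembles a signed 16-bit operand stored as two bytes, high byte first.
    static int readShort(byte[] code, int at) {
        return (short) (((code[at] & 0xff) << 8) | (code[at + 1] & 0xff));
    }

    public static void main(String[] args) {
        // sipush 300; iload 2; ior; ireturn
        byte[] code = {0x11, 0x01, 0x2c, 0x15, 0x02, (byte) 0x80, (byte) 0xac};
        System.out.println(instructionOffsets(code)); // prints [0, 3, 5, 6]
        System.out.println(readShort(code, 1));       // prints 300
    }
}
```

With fixed-length instructions the offsets would be a simple multiple of the instruction size, and no walk would be needed.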

The instruction set provides specifications for opcode and operand syntax and values, and identifier values. It also includes instructions for invoking methods.

The JVM Register Set

The JVM contains four 32-bit registers that store information about the current state of the system. These registers are updated after the execution of each bytecode.

  • pc  The counter that keeps track of which bytecode in the program is currently being executed.
  • optop  The pointer to the top of the operand stack in the Java stack that is used when the program performs operations.
  • frame  The pointer to the current execution environment of the current method in the Java stack.
  • vars  The pointer to the first local variable of the current method that is executing in the Java stack.

Because these registers are few and small, the processor of your machine can work with them quickly.
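As a rough sketch of how the four registers cooperate, the toy interpreter below keeps them as plain int fields that index into one int array standing in for the Java stack. The class RegisterSketch, the memory layout, and the single placeholder slot for the execution environment are all assumptions made for this example; a real JVM is free to organize things differently.

```java
// Hypothetical toy interpreter; the layout of the int[] is an assumption of this sketch.
final class RegisterSketch {
    int pc;     // index of the bytecode currently being executed
    int optop;  // index of the topmost used slot of the operand stack
    int frame;  // index of the current method's execution environment
    int vars;   // index of the current method's first local variable

    // Runs a tiny program over two int arguments and returns the result.
    static int run(byte[] code, int a, int b) {
        int[] stack = new int[8];
        RegisterSketch r = new RegisterSketch();
        r.vars = 0;          // locals live in slots 0 and 1
        stack[0] = a;
        stack[1] = b;
        r.frame = 2;         // one placeholder slot for the execution environment
        r.optop = 2;         // operand stack is empty; the first push lands in slot 3
        r.pc = 0;
        while (true) {
            int opcode = code[r.pc] & 0xff;
            switch (opcode) {
                case 0x1a: // iload_0: push local variable 0
                    stack[++r.optop] = stack[r.vars];
                    break;
                case 0x1b: // iload_1: push local variable 1
                    stack[++r.optop] = stack[r.vars + 1];
                    break;
                case 0x80: { // ior: pop two ints, push their bitwise OR
                    int right = stack[r.optop--];
                    stack[r.optop] = stack[r.optop] | right;
                    break;
                }
                case 0xac: // ireturn: the result is on top of the operand stack
                    return stack[r.optop];
                default:
                    throw new IllegalArgumentException("opcode not covered by this sketch: " + opcode);
            }
            r.pc++; // every opcode handled here is one byte long
        }
    }

    public static void main(String[] args) {
        byte[] code = {0x1a, 0x1b, (byte) 0x80, (byte) 0xac}; // iload_0, iload_1, ior, ireturn
        System.out.println(run(code, 5, 3)); // prints 7
    }
}
```

Each pass through the loop updates pc, and the load and OR instructions move optop, which matches the text's point that the registers change as bytecodes execute.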

The Java Stack

The Java stack provides the current parameters to bytecodes during execution of methods. Each invocation of a method is assigned a stack frame that is stored in the Java stack. Each stack frame holds the current status of local variables, the operand stack, and the execution environment.

The local variables for the method are stored in an array of 32-bit variables indexed by the vars register. Larger values (the 64-bit long and double types) are divided across two local variables. When local variables are used, they are loaded onto the operand stack for the method. The operand stack is a 32-bit last in, first out (LIFO) stack that stores operands for opcodes in the JVM instruction set. These operands are both the parameters used by a method's instructions and the results of those instructions. The execution environment provides information about the current state of the method in the Java stack. It stores pointers to the previous method, pointers to its local variables, and pointers to the top and bottom of the operand stack. It might also contain debugging information.
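The pieces of a stack frame can be modeled in a short sketch. Everything here is hypothetical: the names FrameSketch, storeLong, loadLong, and invokeOr are invented for illustration, and storing the high half of a 64-bit value in the lower-numbered slot is an assumption of this example rather than a documented layout.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of a stack frame, for illustration only.
final class FrameSketch {
    static final class Frame {
        final int[] locals;                                  // 32-bit local variable slots
        final Deque<Integer> operands = new ArrayDeque<>();  // operand stack, used as LIFO
        final Frame previous;                                // execution environment: link to the caller

        Frame(int localCount, Frame previous) {
            this.locals = new int[localCount];
            this.previous = previous;
        }
    }

    // Splits a 64-bit value across two consecutive 32-bit slots (high half first, by assumption).
    static void storeLong(Frame f, int index, long value) {
        f.locals[index] = (int) (value >>> 32);
        f.locals[index + 1] = (int) value;
    }

    // Rebuilds the 64-bit value from its two 32-bit halves.
    static long loadLong(Frame f, int index) {
        return ((long) f.locals[index] << 32) | (f.locals[index + 1] & 0xffffffffL);
    }

    // Simulates invoking a method that ORs two ints; the caller must not be null.
    static int invokeOr(Frame caller, int a, int b) {
        Frame callee = new Frame(2, caller);       // new frame linked to the previous method
        callee.locals[0] = a;
        callee.locals[1] = b;
        callee.operands.push(callee.locals[0]);    // like iload_0
        callee.operands.push(callee.locals[1]);    // like iload_1
        int right = callee.operands.pop();         // last pushed, first popped
        int left = callee.operands.pop();
        callee.operands.push(left | right);        // like ior
        int result = callee.operands.pop();        // like ireturn
        callee.previous.operands.push(result);     // result lands on the caller's operand stack
        return result;
    }

    public static void main(String[] args) {
        Frame mainFrame = new Frame(4, null);
        storeLong(mainFrame, 0, 0x100000002L);
        System.out.println(loadLong(mainFrame, 0) == 0x100000002L); // prints true
        System.out.println(invokeOr(mainFrame, 5, 3));              // prints 7
    }
}
```

The pop order in invokeOr is the reason the operand stack has to be last in, first out: the value pushed most recently is the first one an instruction consumes.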

The Garbage Collection Heap

Java's garbage collector keeps track of references to objects allocated in memory using symbolic handles. When an object is no longer being referenced during the execution of the program, the garbage collector returns the memory used by the object to its garbage collection heap. This heap is a separate area of memory in Java that is allocated when the runtime system is started. It is provided specially for allocation of memory to new objects. If the system the interpreter runs on supports virtual memory, the size of the garbage collection heap can grow as necessary.
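The idea of tracking objects through handles and reclaiming the ones that are no longer referenced can be shown with a toy model. The class HandleHeapSketch below is entirely hypothetical and far simpler than a real collector: objects are just integer handles, references are edges between handles, and collect() discards every handle that cannot be reached from a root.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of a handle-based heap; not how any real JVM is implemented.
final class HandleHeapSketch {
    // Maps each live handle to the handles its object refers to.
    private final Map<Integer, List<Integer>> objects = new HashMap<>();
    // Handles the running program can reach directly (for example, from local variables).
    private final Set<Integer> roots = new HashSet<>();
    private int nextHandle = 1;

    int allocate() {
        int handle = nextHandle++;
        objects.put(handle, new ArrayList<>());
        return handle;
    }

    void addReference(int from, int to) { objects.get(from).add(to); }
    void addRoot(int handle) { roots.add(handle); }
    void removeRoot(int handle) { roots.remove(handle); }
    int liveObjects() { return objects.size(); }

    // Returns unreachable objects' memory to the heap; reports how many were reclaimed.
    int collect() {
        Set<Integer> reachable = new HashSet<>();
        List<Integer> pending = new ArrayList<>(roots);
        while (!pending.isEmpty()) {
            int handle = pending.remove(pending.size() - 1);
            if (reachable.add(handle)) {
                pending.addAll(objects.get(handle)); // follow outgoing references
            }
        }
        int before = objects.size();
        objects.keySet().retainAll(reachable);       // drop everything not reached
        return before - objects.size();
    }

    public static void main(String[] args) {
        HandleHeapSketch heap = new HandleHeapSketch();
        int a = heap.allocate();
        int b = heap.allocate();
        heap.allocate();                    // never referenced by anything
        heap.addRoot(a);
        heap.addReference(a, b);
        System.out.println(heap.collect()); // prints 1
        heap.removeRoot(a);
        System.out.println(heap.collect()); // prints 2
    }
}
```

In a running Java program you never call anything like collect() yourself for correctness; the runtime decides when to reclaim memory.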

The JVM Memory Areas

The other memory areas provided in the JVM are for storing methods and the constant pool. All of the bytecode for Java methods is stored in the method area. The method area also stores symbol tables for dynamic linking of classes and additional debugging information associated with a method. The constant pool area encodes string constants, class names, method names, and field names for each class. It is created by the Java compiler. The JVM does not require these memory areas to be laid out at any particular location. This avoids exposing code to attackers, who could find it if they knew the memory map before runtime.
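One way to see that the compiler really does write a constant pool is to look at the first bytes of a compiled class file, which hold a magic number, the version, and the constant pool count. The sketch below reads just that header. The class name ClassFileHeader is invented for this example, and the main method assumes that the class file for java.lang.Object can be loaded as a resource on your JDK and that InputStream.readAllBytes is available (Java 9 or later).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical reader for the first ten bytes of a class file.
final class ClassFileHeader {
    final int magic;              // 0xCAFEBABE in every valid class file
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;  // number of constant pool entries plus one

    private ClassFileHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Parses the header fields, which are stored high byte first.
    static ClassFileHeader parse(byte[] classBytes) {
        ByteBuffer buffer = ByteBuffer.wrap(classBytes); // big-endian by default
        int magic = buffer.getInt();
        int minor = buffer.getShort() & 0xffff;
        int major = buffer.getShort() & 0xffff;
        int count = buffer.getShort() & 0xffff;
        return new ClassFileHeader(magic, minor, major, count);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the JDK exposes Object.class as a resource.
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            if (in == null) {
                System.out.println("class file not available as a resource");
                return;
            }
            ClassFileHeader header = parse(in.readAllBytes());
            System.out.printf("magic=%08X major=%d constant pool count=%d%n",
                    header.magic, header.majorVersion, header.constantPoolCount);
        }
    }
}
```

The entries that follow the count are where the string constants, class names, method names, and field names mentioned above are encoded.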





Copyright Manjor Inc.