You might be curious as to how
the Java Virtual Machine actually works. A grasp
of the fine points of the JVM gives you a
greater understanding of the security structure
of Java. This section unravels the mystery of
the JVM.
The JVM is intended to provide
a set of specifications that the Java language,
compiler, and interpreter adhere to in order to
ensure secure, portable programs and runtime
environments. The JVM provides a strict set of
rules that can be used by a developer to create
an original implementation of an interpreter
that runs Java code on any machine it is
installed on. These rules require that the
runtime interpreter include all of the following
pieces:
- A set of bytecode
instructions, similar to those of a CPU,
that defines opcodes and operands along with
their values and alignments
- A set of registers
that tracks the state of the program at a
given time
- A Java stack, which
stores information about the states of methods
in stack frames
- A garbage collection
heap, from which memory is
allocated to objects
- Memory areas that
store constants and
methods
When Java code is compiled, it
is converted to bytecode, which is similar to
the assembly
language created by C and C++ compilers. Each
instruction in the bytecode consists of an opcode
followed by zero or more operands. The following
list contains examples of opcodes and their
descriptions:
- iload
loads an integer from a local variable
- aload
loads an object reference from a local variable
- ior
performs a bitwise OR of two integers
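To see two of these opcodes in context, here is
a minimal sketch. The class and method names are
made up for illustration, and the listing in the
comment is what javap -c typically prints for
such a method; exact output can vary with the
compiler version. Note that iload_0 and iload_1
are one-byte short forms of iload 0 and iload 1.

```java
public class OrDemo {
    // Typical javap -c listing for or(int, int):
    //   0: iload_0   // push local variable 0 (a)
    //   1: iload_1   // push local variable 1 (b)
    //   2: ior       // pop both values, push a | b
    //   3: ireturn   // return the int on top of the stack
    public static int or(int a, int b) {
        return a | b;
    }

    public static void main(String[] args) {
        System.out.println(or(5, 3)); // prints 7
    }
}
```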
Opcodes are represented by
8-bit numbers. Operands vary in length and are
aligned to eight bits, so an operand larger than
eight bits is split across multiple bytes. Java
uses such small units to keep the bytecode
compact. The Java team felt that compact code
was worth the performance hit on the CPU while
locating each instruction: because instruction
lengths vary, the interpreter cannot tell where
an instruction begins without decoding the ones
before it. The decision reclaims some of that
lost performance elsewhere, because compact
bytecode travels across networks more quickly
than code with larger, fixed-length
instructions, which carries unused padding. Of
course, code with fixed instruction lengths runs
more quickly on the CPU because the interpreter
can jump through instructions, knowing their
lengths and exact locations in advance.
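The cost of variable-length instructions can be
sketched in a few lines. The opcode values below
are the real JVM encodings, but the class, the
tiny opcode table, and the sample byte sequence
are assumptions made for illustration. The point
is that the walker cannot find an instruction
without stepping over every instruction before
it.

```java
import java.util.ArrayList;
import java.util.List;

public class BytecodeWalker {
    // Real JVM opcode values; only these five are covered by this sketch.
    static final int BIPUSH  = 0x10; // followed by 1 operand byte
    static final int SIPUSH  = 0x11; // followed by 2 operand bytes
    static final int ILOAD   = 0x15; // followed by 1 operand byte
    static final int IOR     = 0x80; // no operand
    static final int IRETURN = 0xac; // no operand

    // Number of operand bytes that follow the given opcode.
    static int operandBytes(int opcode) {
        switch (opcode) {
            case BIPUSH:
            case ILOAD:
                return 1;
            case SIPUSH:
                return 2;
            case IOR:
            case IRETURN:
                return 0;
            default:
                throw new IllegalArgumentException(
                        "opcode not covered by this sketch: " + opcode);
        }
    }

    // Returns the byte offset at which each instruction starts.
    public static List<Integer> instructionOffsets(byte[] code) {
        List<Integer> offsets = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            offsets.add(pc);
            int opcode = code[pc] & 0xff;   // opcodes are unsigned 8-bit values
            pc += 1 + operandBytes(opcode); // skip the opcode and its operands
        }
        return offsets;
    }

    public static void main(String[] args) {
        byte[] code = {
            (byte) ILOAD, 1,           // 2 bytes
            (byte) SIPUSH, 0x01, 0x00, // 3 bytes: a 16-bit operand split in two
            (byte) IOR,                // 1 byte
            (byte) IRETURN             // 1 byte
        };
        System.out.println(instructionOffsets(code)); // prints [0, 2, 5, 6]
    }
}
```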
The instruction set provides
specifications for opcode and operand syntax and
values, and identifier values. It also includes
instructions for invoking methods.
The JVM contains four 32-bit
registers that store information about the
current state of the system. These registers are
updated after the execution of each bytecode.
- pc: The
counter that keeps track of which bytecode in
the program is currently being executed.
- optop: The
pointer to the top of the operand stack in the
Java stack that is used when the program
performs operations.
- frame: The
pointer to the current execution environment
of the current method in the Java stack.
- vars: The
pointer to the first local variable of the
current method that is executing in the Java
stack.
Because there are only four of them, an
implementation can keep these registers in the
registers of your machine's processor, where
they are accessed quickly.
The Java stack provides the
current parameters to bytecodes during execution
of methods. Each time a method is invoked, it is
assigned a stack frame that is stored in the
Java stack. Each stack frame holds the current
status of local variables, the operand stack,
and the execution environment.
The local variables for the
method are stored in an array of 32-bit
variables indexed by the vars register.
Larger values (the 64-bit long and double types)
are divided across two local variables. When
local variables are used, they are loaded onto
the operand stack for the method. The operand
stack is a 32-bit-wide last in, first out (LIFO)
stack that stores operands for opcodes in the
JVM instruction set. These operands serve both
as parameters to instructions and as results of
instructions. The execution environment provides
information about the current state of the
method in the Java stack. It stores pointers to
the previous method, pointers to its local
variables, and pointers to the top and bottom of
the operand stack. It might also contain
debugging information.
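The interplay between local variables and the
operand stack can be sketched as a toy frame.
This is an illustration under stated
assumptions, not how any real JVM is written:
the class name, the helper methods, and the
choice to store the high half of a long in the
first of its two slots are all made up here.
Values are pushed and popped at the same end of
the operand stack, which the optop field tracks.

```java
public class FrameSketch {
    private final int[] locals;   // 32-bit local variable slots (what vars points at)
    private final int[] operands; // 32-bit operand stack slots
    private int optop = 0;        // index of the next free operand slot

    public FrameSketch(int maxLocals, int maxStack) {
        locals = new int[maxLocals];
        operands = new int[maxStack];
    }

    private void push(int value) { operands[optop++] = value; }
    private int pop() { return operands[--optop]; }

    public void setLocal(int index, int value) { locals[index] = value; }
    public int getLocal(int index) { return locals[index]; }

    // iload: copy a local variable onto the operand stack.
    public void iload(int index) { push(locals[index]); }

    // istore: pop the top operand into a local variable.
    public void istore(int index) { locals[index] = pop(); }

    // ior: pop two operands and push their bitwise OR as the result.
    public void ior() {
        int right = pop();
        int left = pop();
        push(left | right);
    }

    // A 64-bit value takes two consecutive 32-bit slots.
    // Assumption of this sketch: high half first, low half second.
    public void storeLong(int index, long value) {
        locals[index] = (int) (value >>> 32);
        locals[index + 1] = (int) value;
    }

    public long loadLong(int index) {
        return ((long) locals[index] << 32) | (locals[index + 1] & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        FrameSketch frame = new FrameSketch(5, 2);
        frame.setLocal(0, 5);
        frame.setLocal(1, 3);
        frame.iload(0);
        frame.iload(1);
        frame.ior();
        frame.istore(2);
        System.out.println(frame.getLocal(2)); // prints 7

        frame.storeLong(3, -5L);               // occupies slots 3 and 4
        System.out.println(frame.loadLong(3)); // prints -5
    }
}
```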
Java's garbage collector keeps
track of references to objects allocated in
memory using symbolic handles. When an object is
no longer being referenced during the execution
of the program, the garbage collector returns
the memory used by the object to its garbage
collection heap. This heap is a separate area of
memory in Java that is allocated when the
runtime system is started. It is reserved
specifically for allocating memory to new
objects. If the system the interpreter runs on
supports virtual memory, the size of the garbage
collection heap can grow as necessary.
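The idea of reclaiming objects that are no
longer referenced can be sketched with a toy
heap that hands out integer handles. The text
above does not say which algorithm the collector
uses, so the mark-and-sweep approach here is an
assumption, as are the class and method names.
Objects reachable from a root survive a
collection; everything else is returned to the
heap.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ToyHeap {
    // Handle table: each handle maps to the handles that object refers to.
    private final Map<Integer, List<Integer>> objects = new HashMap<>();
    // Handles the running program still holds directly.
    private final Set<Integer> roots = new HashSet<>();
    private int nextHandle = 1;

    public int allocate() {
        int handle = nextHandle++;
        objects.put(handle, new ArrayList<>());
        return handle;
    }

    public void addReference(int from, int to) { objects.get(from).add(to); }
    public void addRoot(int handle) { roots.add(handle); }
    public void removeRoot(int handle) { roots.remove(handle); }
    public boolean isLive(int handle) { return objects.containsKey(handle); }

    // Marks everything reachable from the roots, frees the rest,
    // and returns the number of objects reclaimed.
    public int collect() {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            int handle = pending.pop();
            if (marked.add(handle)) {
                pending.addAll(objects.get(handle));
            }
        }
        int before = objects.size();
        objects.keySet().retainAll(marked); // sweep: drop unmarked objects
        return before - objects.size();
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        int a = heap.allocate();
        int b = heap.allocate();
        int c = heap.allocate();   // never referenced by anything
        heap.addRoot(a);
        heap.addReference(a, b);

        System.out.println(heap.collect()); // prints 1 (only c is reclaimed)
        heap.removeRoot(a);
        System.out.println(heap.collect()); // prints 2 (a and b are reclaimed)
    }
}
```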
The other memory areas
provided in the JVM are for storing methods and
the constant pool. All of the bytecode for Java
methods is stored in the method area. It also
stores symbol tables for dynamic linking of
classes and additional debugging information
associated with a method. The constant pool area
encodes string constants, class names, method
names, and field names for each class. It is
created by the Java compiler. These memory areas
are not required to be laid out in any
particular location. This is deliberate: it
avoids exposing them to attackers, who could
locate the code they want to tamper with if they
knew the memory map before runtime.
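Because the compiler creates the constant pool,
its size is recorded near the start of every
compiled class file. The sketch below reads that
count from the fixed header fields (magic number,
minor version, major version, constant pool
count). The class name is made up, and the ten
sample bytes are a hand-built header rather than
a real class file. The stored count is one
greater than the number of entries, because
index 0 is not used.

```java
import java.nio.ByteBuffer;

public class ClassHeaderSketch {
    // Reads the constant pool count from the start of a class file image.
    public static int constantPoolCount(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        int magic = buf.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor version (ignored here)
        buf.getShort();                 // major version (ignored here)
        return buf.getShort() & 0xFFFF; // constant pool count, unsigned 16-bit
    }

    public static void main(String[] args) {
        // Hand-built header for illustration only.
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // magic
            0x00, 0x00,                                         // minor version
            0x00, 0x34,                                         // major version
            0x00, 0x10                                          // constant pool count
        };
        System.out.println(constantPoolCount(header)); // prints 16
    }
}
```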