
How JVM works

Published: 2012-09-10 22:20:12 Author: rapoo

How JVM works?

The JVM (Java Virtual Machine) is a core part of the Java platform.



The gap between human and machine

Before looking at how the JVM works, it helps to understand why Java, C, and C# are called high-level languages.

What these languages have in common is that they are readable by humans but not by machines. Before a machine can run an application written in a high-level language, the source code must be converted into a specific format that the computer (CPU) understands, no matter which language you use.




As mentioned earlier, there are many different CPUs, so we need separate compilers for different hardware. For example, the same C code has to be compiled with an Apple Macintosh compatible compiler in order to run on Apple computers; if users also want to run the same code on Windows on the Intel platform, they need another C compiler for Windows.


Simply put, a compiler converts a source code file (which is a plain text file) into an executable file that can be run on the host computer. In practice, the process is more complex than that.


Below is an example of how a C compiler works (for details, see http://www.codeproject.com/Articles/1825/The-Common-Language-Runtime-CLR-and-Java-Runtime-E#_interpreters):



[Figure: how a C compiler works]







What are interpreters?

Although they look similar to compilers, interpreters are the other extreme of running programming languages. Pure interpreters do not do translation work the way compilers do.


Interpreters take code written in a high-level language and execute it one statement at a time, so pure interpreters have no chance to do any code optimization. They are also unable to check the syntax of the whole program in advance, as compilers do.
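To make the point about syntax checking concrete, here is a minimal sketch of a pure interpreter in Java. The toy language (its only statement is `print <text>`) and the class name `PureInterpreter` are made up for this illustration; real shells are far more elaborate.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy pure interpreter: reads one line, executes it, then moves on to the next. */
final class PureInterpreter {

    /**
     * Runs the script line by line and returns everything it produced.
     * The only valid statement in this made-up language is "print <text>".
     */
    static List<String> run(String script) {
        List<String> output = new ArrayList<>();
        String[] lines = script.split("\n");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("print ")) {
                output.add(line.substring("print ".length()));
            } else {
                // The error is discovered only when execution reaches this line.
                output.add("error at line " + (i + 1) + ": " + line);
                break;
            }
        }
        return output;
    }

    public static void main(String[] args) {
        String script = "print hello\nprint world\nprnt oops\nprint never reached";
        for (String line : run(script)) {
            System.out.println(line);
        }
    }
}
```

In this sketch the first two statements have already run by the time the typo on line 3 is noticed. A compiler would have rejected the whole program before executing anything.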


Examples of pure interpreters are some scripting languages that interact with operating systems. Shell scripts in Linux, and batch files (.bat) and command files (.cmd) in Windows, are all examples of purely interpreted languages.


The figure below shows how pure interpreters work:


[Figure: how a pure interpreter works]


What is the hybrid approach?

Most popular modern languages, however, are not based on pure interpreters. They are either compiled (like C and C++) or use a hybrid approach (like Java).


The figure below shows how the hybrid compiler-interpreter approach works:


[Figure: how the hybrid compiler-interpreter approach works]


As is obvious from the above diagrams, today's popular interpreted languages are not purely interpreted. They use a compilation step to produce an intermediate code (e.g. Microsoft's Intermediate Language, MSIL, or Sun's Java byte code). It is this intermediate language that the interpreter works on, not the original high-level source code. This approach avoids many of the problems inherent in purely interpreted languages and gives many of the advantages of fully compiled languages.
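The hybrid idea can be sketched in a few lines of Java. Everything here is an assumption made for illustration: the source language (integers joined by `+` and `-`), the three opcodes, and the class name `HybridDemo`. It is not how javac or the JVM are implemented, only the same two-stage shape.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy hybrid pipeline: compile source text to intermediate code once, then interpret that code. */
final class HybridDemo {
    // Made-up opcodes of the intermediate language.
    static final int PUSH = 0, ADD = 1, SUB = 2;

    /** "Compiler": translates e.g. "10 + 5 - 3" into a flat list of opcodes and operands. */
    static List<Integer> compile(String source) {
        String[] tokens = source.trim().split("\\s+");
        if (tokens.length % 2 == 0) {
            throw new IllegalArgumentException("malformed expression: " + source);
        }
        List<Integer> code = new ArrayList<>();
        code.add(PUSH);
        code.add(Integer.parseInt(tokens[0]));
        for (int i = 1; i < tokens.length; i += 2) {
            code.add(PUSH);
            code.add(Integer.parseInt(tokens[i + 1]));
            switch (tokens[i]) {
                case "+": code.add(ADD); break;
                case "-": code.add(SUB); break;
                default: throw new IllegalArgumentException("unknown operator: " + tokens[i]);
            }
        }
        return code;
    }

    /** "Interpreter": a small stack machine that only ever sees the intermediate code. */
    static int execute(List<Integer> code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < code.size()) {
            int op = code.get(pc++);
            if (op == PUSH) {
                stack.push(code.get(pc++));
            } else {
                int right = stack.pop();
                int left = stack.pop();
                stack.push(op == ADD ? left + right : left - right);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        List<Integer> code = compile("10 + 5 - 3");
        System.out.println(code + " -> " + execute(code));
    }
}
```

Syntax errors surface in `compile`, before anything runs, and `execute` never has to look at the source text again. Those are the two benefits the paragraph above attributes to the intermediate code.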


The execution mechanism of compiled and interpreted languages:

A compiler does this conversion off-line and in one go (as discussed in the "Who can compensate the gap?" section), whereas an interpreter does the conversion one program statement at a time.


A compiled program runs in a fetch-execute cycle, whereas an interpreted program runs in a decode-fetch-execute cycle. The decoding is done by the interpreter, whereas the fetch and execute operations are done by the CPU. In an interpreter the bottleneck is the decoding phase, and hence an interpreted program may be 30-100% slower than a compiled program.


Below are two figures that illustrate the flow of execution of compilers (first figure) and interpreters (second figure):


[Figure: flow of execution of a compiled program]


[Figure: flow of execution of an interpreted program]


It is evident from the above flowcharts that an interpreted program has the overhead of decoding each statement one by one; thus, in an interpreted program, the bottleneck is the decoding process.
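The decoding overhead can be made visible by counting decode steps. The sketch below is only an illustration: the statement format `add <n>` and the class `DecodeCost` are invented here, and it counts operations rather than measuring real time.

```java
/** Counts how often a statement must be decoded with and without a prior compile step. */
final class DecodeCost {

    /** Result of a run: the computed value and the number of decode steps performed. */
    static final class Result {
        final long value;
        final int decodes;

        Result(long value, int decodes) {
            this.value = value;
            this.decodes = decodes;
        }
    }

    /** Decodes a made-up statement of the form "add <n>" into its numeric operand. */
    static int decode(String statement) {
        String[] parts = statement.trim().split("\\s+");
        if (parts.length != 2 || !parts[0].equals("add")) {
            throw new IllegalArgumentException("cannot decode: " + statement);
        }
        return Integer.parseInt(parts[1]);
    }

    /** Interpreter style: every statement is decoded again on every loop iteration. */
    static Result interpret(String[] body, int iterations) {
        long total = 0;
        int decodes = 0;
        for (int i = 0; i < iterations; i++) {
            for (String statement : body) {
                total += decode(statement);
                decodes++;
            }
        }
        return new Result(total, decodes);
    }

    /** Compiler style: decode each statement once up front, then only execute. */
    static Result compileThenRun(String[] body, int iterations) {
        int[] operands = new int[body.length];
        int decodes = 0;
        for (int i = 0; i < body.length; i++) {
            operands[i] = decode(body[i]);
            decodes++;
        }
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            for (int operand : operands) {
                total += operand;
            }
        }
        return new Result(total, decodes);
    }
}
```

For a three-statement loop body executed 1000 times, `interpret` performs 3000 decode steps while `compileThenRun` performs 3, and both compute the same value.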


Both compiled and interpreted approaches have their own advantages and disadvantages; the details are discussed later. Note that both approaches eventually convert the source code to machine language, but the processes are different.





Compare and contrast compiled and interpreted languages (extremely important for linking the concepts of compiler and interpreter with the next section, which discusses Java platform independence, the JIT compiler and the .NET IL compiler):



Languages can be implemented as fully compiled, purely interpreted, or hybrid compiled-interpreted. As a matter of fact, most current programming languages have both compiled and interpreted versions available.

Both compiled and interpreted approaches have their advantages and disadvantages. Let's start with the compiled languages.


Compiled languages (Sample: C and C++)

  1. One of the biggest advantages of compiled languages is their execution speed. A program written in C/C++ runs 30-70% faster than an equivalent program written in Java.
  2. Compiled code also takes less memory than an interpreted program.
  3. On the down side - a compiler is much more difficult to write than an interpreter.
  4. A compiler does not provide much help in debugging a program - how many times have you hit a null pointer crash in your C code (typically reported only as a segmentation fault) and spent hours trying to figure out where in your source code it occurred? (Maybe this is why debugging a C program is such annoying work!)
  5. Executable compiled code is much bigger than equivalent interpreted code, e.g. a C/C++ .exe file is much bigger than an equivalent Java .class file.
  6. Compiled programs are targeted at a particular platform and hence are platform dependent.
  7. Compiled programs do not allow security to be implemented within the code - e.g. a compiled program can access any area of the memory, and can do whatever it wants with your PC (most viruses are written in compiled languages).
  8. Due to loose security and platform dependence - a compiled language is not particularly suited to developing Internet or web-based applications.

Interpreted languages

  1. Interpreted languages provide excellent debugging support. A Java programmer spends only a few minutes fixing a "Null pointer exception", because the Java runtime not only specifies the nature of the exception but also gives the exact line number and function call sequence (the famous stack trace information) where the exception occurred. This is something that a compiled language typically does not provide out of the box.
  2. Another advantage is that interpreters are much easier to build than compilers.
  3. One of the biggest advantages of interpreters is that they make platform independence possible.
  4. Interpreted languages also allow a high degree of security - something badly needed for an Internet application.
  5. Intermediate language code is much smaller than compiled executable code.
  6. Platform independence and tight security are the two most important factors that make an interpreted language ideally suited for Internet and web-based applications.
  7. Interpreted languages have some serious drawbacks. Interpreted applications take up more memory and CPU resources. This is because, in order to run a program written in an interpreted language, the corresponding interpreter must be run first. Interpreters are sophisticated, intelligent and resource-hungry programs, and they take up a lot of CPU cycles and RAM.
  8. Due to an interpreted application's decode-fetch-execute cycle, it is much slower than a compiled program.
  9. Interpreters also do a lot of code optimization and security violation checking at run-time; these extra steps take up even more resources and slow the application down further.
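The stack trace information mentioned in point 1 is available to the program itself. The following is a small sketch using the standard `Throwable.getStackTrace()` API; the class `StackTraceDemo` and its methods are hypothetical names chosen for this example.

```java
/** Shows the information the Java runtime attaches to an exception. */
final class StackTraceDemo {

    /** Throws a NullPointerException when text is null. */
    static int lengthOf(String text) {
        return text.length();
    }

    /** Triggers the exception and returns the top stack frame, which names the failing method. */
    static StackTraceElement failingFrame() {
        try {
            lengthOf(null);
            throw new IllegalStateException("expected a NullPointerException");
        } catch (NullPointerException e) {
            return e.getStackTrace()[0];
        }
    }

    public static void main(String[] args) {
        StackTraceElement frame = failingFrame();
        System.out.println("failed in " + frame.getClassName() + "." + frame.getMethodName()
                + " at line " + frame.getLineNumber());
    }
}
```

The top frame points at `lengthOf`, the method that dereferenced the null value, and the frames below it give the call sequence. The line number is present when the class was compiled with line number information, which is javac's default.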



Platform dependence issues for compiled languages:

As explained above, after the compiler compiles the source code to .obj code, a linker converts it to executable code. Both the .obj and the executable code are machine/platform dependent.

In brief, C and C++ are platform dependent, which is one of their shortcomings.



How about Java?

To develop a Java application, there is one package you must have installed on your computer: the JDK (Java Development Kit). Like the SDKs (Software Development Kits) of other languages, the JDK is a comprehensive set of software that includes all the bits and pieces required for developing Java applications.



The JDK includes:

  1. JVM (Java Virtual Machine)
  2. JRE (Java Runtime Environment). Note that the JVM is actually a part of the JRE.
  3. Java packages and framework classes
  4. javac (the compiler)
  5. Java debugger


After completing the application, the programmer uses the compiler to compile the source code (.java) and produce the class file (.class). The class file is an intermediate Java byte code file.
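The .java to .class step is normally done on the command line with `javac`, but it can also be driven from Java through the standard `javax.tools` API. The sketch below assumes it runs on a JDK (on a plain JRE `ToolProvider.getSystemJavaCompiler()` returns null) and that a temporary directory can be written. The class `CompileDemo` and the generated `Hello` class are names invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

/** Compiles a tiny source file to a .class file using the compiler bundled with the JDK. */
final class CompileDemo {

    /** Returns the path of the generated Hello.class, or null if no compiler is available. */
    static Path compileHello() {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            return null; // running on a JRE without the compiler
        }
        try {
            Path dir = Files.createTempDirectory("compile-demo");
            Path source = dir.resolve("Hello.java");
            String code = "public class Hello {\n"
                    + "    public static void main(String[] args) {\n"
                    + "        System.out.println(\"Hello from byte code\");\n"
                    + "    }\n"
                    + "}\n";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));

            // Same effect as running: javac -d <dir> <dir>/Hello.java
            int status = compiler.run(null, null, null, "-d", dir.toString(), source.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed with status " + status);
            }
            return dir.resolve("Hello.class");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("class file: " + compileHello());
    }
}
```

The resulting `Hello.class` is the byte code file; running it still requires a JVM, for example `java -cp <dir> Hello`.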


The byte code file is the key: it contains machine-independent intermediate code that can be executed on any computer that has a JRE installed.
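One way to see that a class file is a fixed, platform-neutral format is to look at its first four bytes, which are always the magic number 0xCAFEBABE on every platform. The sketch below reads them from the class file of `java.lang.String` that ships with the runtime. `ClassFileHeader` is a name made up for this example, and the helper only handles top-level classes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Reads the header of a .class file to show that byte code has one fixed format. */
final class ClassFileHeader {

    /** Returns the first four bytes of the class file that defines the given top-level class. */
    static int magic(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            // Class files store multi-byte values big-endian, which is what readInt expects.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(magic(String.class))); // prints "cafebabe"
    }
}
```

The same bytes appear whether the runtime is installed on Windows, Linux or macOS; only the JVM that reads them is platform specific.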


What makes Java platform independent is the ubiquity of the JRE. JREs are available for most commercial and popular platforms. Programmers write the code once, and the same program runs on any of those platforms.


Note that the JDK must be compatible with the platform, which means that different platforms need different JDK installations. See the figure below:


[Figure: the same byte code running on platform-specific JDK/JRE installations]
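A quick way to observe this split between portable byte code and a platform-specific runtime is to ask the JVM where it is running. The sketch below uses the standard system properties `os.name`, `os.arch` and `java.vm.name`; the class name `PlatformInfo` is an assumption for this example.

```java
/** Reports the platform that the current JVM was built for. */
final class PlatformInfo {

    /** Builds a one-line description from standard system properties. */
    static String describe() {
        return "os=" + System.getProperty("os.name")
                + ", arch=" + System.getProperty("os.arch")
                + ", jvm=" + System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The compiled `PlatformInfo.class` can be copied unchanged to another machine; the output changes because the JVM underneath is different, not because the byte code was rebuilt.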


What is the JVM? (extracted from the web; see the reference section for the source)

Before I discuss the JVM in details, let me clarify a few related terms.
