Reputation: 511
I have been figuring out the exact working of an interpreter, have googled around and have come up with some conclusion, just wanted it to be rectified by someone who can give me a better understanding of the working of interpreter.
So what i have understood is:
Now i am still not clear with the sub process that happens in between i.e.
Some more questions:
Upvotes: 13
Views: 13004
Reputation: 116878
An interpreter is a software program that converts code from high level language to machine format.
No. That's a compiler. An interpreter is a computer program that executes the instructions written in a language directly. This is different from a compiler that converts a higher level language into a lower language. The C compiler goes from C to assembly code with the assembler (another type of compiler) translates from assembly to machine code -- modern C compilers do both steps to go from C to machine code.
In Java, the java compiler does code verification and converts from Java source to byte-code class files. It also does a number of small processing tasks such as pre-calculation of constants (if possible), caching of strings, etc..
now platform for a java interpreter is the JVM, in which it runs, so basically it is going to produce code which can be run by JVM.
The JVM operates on the bytecode directly. The java interpreter is integrated so closely with the JVM that they shouldn't really be thought of as separate entities. What also is happening is a crap-ton of optimization where bytecode is basically optimized on the fly. This makes calling it just an interpreter inadequate. See below.
so it takes the bytecode produces intermediate code and the target machine code and gives it to JVM.
The JVM is doing these translations.
JVM in turns executes that code on the OS platform in which JVM is implemented or being run.
I'd rather say that the JVM uses the bytecode, optimized user code, the java libraries which include java and native code, in conjunction with OS calls to execute java applications.
now i am still not clear with the sub process that happens in between i.e. 1. interpreter produces intermediate code. 2. interpreted code is then optimized. 3. then target code is generated 4. and finally executed.
The Java compiler generates bytecode. When the JVM executes the code, steps 2-4 happen at runtime inside of the JVM. It is very different than C (for example) which has these separate steps being run by different utilities. Don't think about this as "subprocesses", think about it as modules inside of the JVM.
so is the interpreter alone responsible for generating target code ? and executing it ?
Sort of. The JVM's interpreter by definition reads the bytecode and executes it directly. However, in modern JVMs, the interpreter works in tandem with the Just-In-Time compiler (JIT) to generate native code on the fly so that the JVM can have your code execute more efficiently.
In addition, there are post-processing "compilation" stages which analyze the generated code at runtime so that native code can be optimized by inlining often-used code blocks and through other mechanisms. This is the reason why the JVM load spikes so high on startup. Not only is it loading in the jars and class files, but it is in effect doing a cc -O3
on the fly.
and does executing means it gets executed in JVM or in the underlying OS ?
Although we talk about the JVM executing the code, this is not technically correct. As soon as the byte-code is translated into native code, the execution of the JVM and your java application is done by the CPU and the rest of the hardware architecture.
The Operating System is the substrate that that does all of the process and resource management so the programs can efficiently share the hardware and execute efficiently. The OS also provides the APIs for applications to easily access the disk, network, memory, and other hardware and resources.
Upvotes: 14
Reputation: 109547
There are two ways of executing a program.
.c
) to machine code, on Windows .exe
. This can then be executed independent of the compiler.This compilation can be done by compiling several .c
files to several object files (intermediate products), and then linking them into a single application or library.
.java
) and "immediately" executes the program.With java the approach is a bit hybrid/stacked: the java compiler javac
compiles .java
to .class
files, and possibles zips those in .jar
(or .war
, .ear
...). The .class files consist of a more abstract byte code, for an abstract stack machine.
Then the java runtime java
(call JVM, java virtual machine, or byte code interpreter) can execute a .class/.jar. This is in fact an interpreter of java byte code.
Nowadays it also translates (parts) of the byte code at run time to machine code. This is also called a just-in-time compiler for byte code to machine code.
In short:
- a compiler
just creates code;
- an interpreter
immediately executes.
An interpreter will loop over parsed commands / a high level intermediate code, and interprete every command with a piece of code. Indirect an in principle slow.
Upvotes: 1
Reputation: 728
I'll answer based on my experience on creating a DSL.
C is compiled because you run pass the source code to the gcc and runs the stored program in machine code.
Python is interpreted because you run programs by passing the program source to the interpreter. The interpreter reads the source file and executes it.
Java is a mix of both because you "compile" the Java file to bytecode, then invokes the JVM to run it. Bytecode isn't machine code, it needs to be interpreted by the JVM. Java is in a tier between C and Python because you cannot do fancy things like a "eval" (evaluating chunks of code or expressions at runtime as in Python). However, Java has reflection abilities that are impossible to a C program to have. In short, the design of Java runtime being in a intermediary tier between a pure compiled and a interpreted language gives the best (and the worst) of the two words in terms of performance and flexibility.
However, Python also has a virtual machine and it's own bytecode format. The same applies to Perl, Lua, etc. Those interpreters first converts a source file to a bytecode, then they interpret the bytecode.
I always wondered why doing this is necessary, until I made my own interpreter for a simulation DSL. My interpreter does a lexical analysis (break a source in tokens), converts it to a abstract syntax tree, then it evaluates the tree by traversing it. For software engineering sake I'm using some design patterns and my code heavily uses polymorphism. This is very slow in comparison to processing a efficient bytecode format that mimics a real computer architecture. My simulations would be way faster if I create my own virtual machine or use a existent one. For evaluating a long numeric expression, for instance, it'll be faster to translate it to something similar to assembly code than processing a branch of a abstract tree, since it requires calling a lot of polymorphic methods.
Upvotes: 3
Reputation: 1215
There are 2 main steps to a java application: compilation, and runtime. Each process has very different functions and purposes. The main processes for both are outlined below:
[com.sun.tools.javac][1]
usually found in the tools.jar file, traditionally in your $JAVA_HOME - the same place as java.jar, etc.Compilation steps:
[com.sun.tools.JCTree][3]
. The overall idea is that there is a java object for each Expression and each Statement. At this point in time relatively little is known about actual "types" that each represent. The only thing that is checked for at the creation of the AST is literal syntaxUpvotes: 8
Reputation: 718798
1) An interpreter is a software program that converts code from high level language to machine format.
Incorrect. An interpreter is a program that runs a program expressed in some language that is NOT the computer's native machine code.
There may be a step in this process in which the source language is parsed and translated to an intermediate language, but this is not a fundamental requirement for an interpreter. In the Java case, the bytecode language has been designed so that neither parsing or a distinct intermediate language are necessary.
2) speaking specifically about java interpreter, it gets code in binary format (which is earlier translated by java compiler from source code to bytecode).
Correct. The "binary format" is Java bytecodes.
3) now platform for a java interpreter is the JVM, in which it runs, so basically it is going to produce code which can be run by JVM.
Incorrect. The bytecode interpreter is part of the JVM. The interpreter doesn't run on the JVM. And the bytecode interpreter doesn't produce anything. It just runs the bytecodes.
4) so it takes the bytecode produces intermediate code and the target machine code and gives it to JVM.
Incorrect.
5) JVM in turns executes that code on the OS platform in which JVM is implemented or being run.
Incorrect.
The real story is this:
1 - A typical bytecode interpreter does some work to map abstract stack frames and object layouts to concrete ones involving target-specific sizes and offsets. But to call this an "intermediate code" is a stretch. The interpreter is really just enhancing the bytecodes.
Upvotes: 9