Class Files in Java 2

The Traditional Development Life Cycle

Java is a compiled language. That is, source code is written in a high-level language and then converted through a process of compilation to a machine-level language, the Java bytecode, which then runs on the Java Virtual Machine (JVM). Before we look more closely at Java bytecode, it is worth reviewing how programs are traditionally built and run.

Program files are recognized in different ways depending on the operating environment. On most desktop operating systems, program files are recognized first by the file extension (such as exe or com) and secondly by the file format itself. Executable files contain information in a header which informs the operating system that this file is a program and has certain requirements in order to run. These requirements include such things as the address at which the program should be loaded, other supporting files which will be required and so on.

When the operating system attempts to run a program file, it loads the file and ensures that the header is legitimate, that is, that it describes a real program. The header also indicates where the starting point of the program itself is. The program is stored in the program file as machine code instructions. These instructions are numeric values which are read and interpreted by the processor as it executes. Having validated the header, the operating system starts executing the code at the indicated starting point.
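The header check described above can be illustrated with a small sketch. It is not how any operating system actually validates a program; it only inspects the first few bytes of a file, where many formats place an identifying signature. The class name MagicSniffer and its method names are invented for this example. The signatures themselves are the well-known ones for Java class files (CA FE BA BE), ELF files (7F 'E' 'L' 'F') and DOS/Windows executables ('M' 'Z').

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

/** Illustrative only: guesses a file's format from the signature bytes in its header. */
final class MagicSniffer {

    /** Classifies a file from the leading bytes of its header. */
    static String identify(byte[] header) {
        // 0xCAFEBABE also begins Mach-O universal binaries, so this is a hint, not proof.
        if (header.length >= 4
                && (header[0] & 0xFF) == 0xCA && (header[1] & 0xFF) == 0xFE
                && (header[2] & 0xFF) == 0xBA && (header[3] & 0xFF) == 0xBE) {
            return "Java class file";
        }
        if (header.length >= 4
                && header[0] == 0x7F && header[1] == 'E'
                && header[2] == 'L' && header[3] == 'F') {
            return "ELF executable or object file";
        }
        if (header.length >= 2 && header[0] == 'M' && header[1] == 'Z') {
            return "DOS/Windows executable";
        }
        return "unknown";
    }

    /** Reads up to four bytes from the start of the file and classifies them. */
    static String identify(Path file) throws IOException {
        byte[] header = new byte[4];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < header.length
                    && (n = in.read(header, total, header.length - total)) > 0) {
                total += n;
            }
        }
        // Short files yield a shorter array, which identify(byte[]) handles.
        return identify(Arrays.copyOf(header, total));
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            System.out.println(name + ": " + identify(Paths.get(name)));
        }
    }
}
```

Running it with one or more file names as arguments prints a guess for each. A real loader goes much further, checking load addresses, required libraries and the entry point recorded in the header.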

It should be clear that anyone with a good understanding of the header format and of the machine code for a particular operating system could construct a program file using little more than an editor capable of producing binary files.

Of course, this is not how programs are produced. The closest that anyone gets to this is writing assembler code. Assembler language programming is very low-level. Its statements, after macro expansion, usually translate into one or at most two machine language instructions. The assembler source code is then fed through an assembler, which converts the (almost) human-readable code into machine code, generates the appropriate header and finally outputs an executable file.
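To make the near one-to-one mapping between assembler statements and machine instructions concrete, here is a minimal sketch of an assembler for a made-up machine. Everything about the machine is an assumption for illustration: the mnemonics, the one-byte opcode values and the single one-byte operand are invented and do not correspond to any real processor. No header is generated.

```java
import java.io.ByteArrayOutputStream;
import java.util.Locale;
import java.util.Map;

/** Toy assembler for a hypothetical machine; the opcode values are invented. */
final class ToyAssembler {
    // Hypothetical one-byte opcodes; each instruction takes one one-byte operand.
    private static final Map<String, Integer> OPCODES = Map.of(
            "LOAD", 0x01,
            "ADD", 0x02,
            "STORE", 0x03,
            "JUMP", 0x04);

    /** Translates lines such as "LOAD 16" into opcode/operand byte pairs. */
    static byte[] assemble(String source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith(";")) {
                continue; // skip blank lines and comment lines
            }
            String[] parts = trimmed.split("\\s+");
            Integer opcode = OPCODES.get(parts[0].toUpperCase(Locale.ROOT));
            if (opcode == null || parts.length != 2) {
                throw new IllegalArgumentException("Cannot assemble: " + trimmed);
            }
            out.write(opcode);                             // one statement ...
            out.write(Integer.parseInt(parts[1]) & 0xFF);  // ... one instruction
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] code = assemble("LOAD 16\nADD 17\nSTORE 18");
        StringBuilder hex = new StringBuilder();
        for (byte b : code) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // prints: 01 10 02 11 03 12
    }
}
```

Each source line becomes exactly one two-byte instruction, which is the sense in which assembler language is a thin, readable layer over machine code.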

Most programs, however, are written in a high-level language such as C, C++ or COBOL. It is the task of the compiler to translate high-level instructions into low-level machine code as efficiently as possible. The resultant machine code is generally very efficient, although, depending on the compiler, it may be possible to write more efficient code by hand in assembler language. Because different compilers manage the translation and optimization process in different ways, they will produce different output for the same source code. In general, the higher-level the source language, the more scope there is for variation in the resultant executable file, since there are more possible translations of each high-level statement into low-level machine code.

During the compilation process, high-level features such as variable and function names are replaced by references to addresses in memory and by machine code instructions that cause the appropriate address to be accessed (in the case of variables) or jumped to (in the case of functions).
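The replacement of names by addresses can be sketched with a simple symbol table. This is an illustration only, not a description of any particular compiler: the class SymbolTable, the base address 0x1000 and the fixed four-byte slot size are all assumptions chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative symbol table that assigns a memory address to each name it sees. */
final class SymbolTable {
    private final Map<String, Integer> addresses = new LinkedHashMap<>();
    private final int slotSize;
    private int nextAddress;

    /** baseAddress and slotSize are arbitrary values picked by the caller. */
    SymbolTable(int baseAddress, int slotSize) {
        this.nextAddress = baseAddress;
        this.slotSize = slotSize;
    }

    /** Returns the address for a name, allocating a new slot on first use. */
    int addressOf(String name) {
        Integer existing = addresses.get(name);
        if (existing != null) {
            return existing;
        }
        int assigned = nextAddress;
        addresses.put(name, assigned);
        nextAddress += slotSize;
        return assigned;
    }

    /** Rewrites a space-separated statement, replacing each name with its address. */
    String lower(String statement) {
        StringBuilder out = new StringBuilder();
        for (String token : statement.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            if (token.matches("[A-Za-z_]\\w*")) {
                out.append(String.format("[0x%04X]", addressOf(token)));
            } else {
                out.append(token); // operators and literals pass through unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SymbolTable table = new SymbolTable(0x1000, 4);
        System.out.println(table.lower("total = price + tax"));
        // prints: [0x1000] = [0x1004] + [0x1008]
    }
}
```

After this step the names total, price and tax no longer appear in the output; only the addresses remain, which is why names are generally not recoverable from traditional machine code.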

In the case of both assembler language and high-level language programming, the output of the assembly or compilation phase is generally not immediately executable. Instead, an intermediate file (known as an object module or object file) is produced. One object file is produced for each source file compiled, regardless of the content or structure of the source code. These object modules are then combined using a tool called a linker, which is responsible for producing the final executable file (or shared library). The linker ensures that references from one object module to a function or variable in another object module are correctly resolved.

Figure. Program Compilation and Linking

In summary, then:
• An object file contains the machine code which is the actual program plus some additional information describing any dependencies on other object files.
• An executable file is a collection of object files with all inter-file dependencies resolved, together with some header information which identifies the file as executable.
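The linking step summarized above can be sketched as follows. This is a deliberately simplified model under stated assumptions: the ObjectModule class, the module names main.o and io.o, the sizes and the symbol offsets are all hypothetical, and real linkers also handle relocation records, sections, libraries and header generation, none of which is modelled here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy linker: places modules end to end and resolves symbols between them. */
final class ToyLinker {

    /** Hypothetical object module: code size, symbols defined, symbols needed. */
    static final class ObjectModule {
        final String name;
        final int size;                        // size of the module's machine code
        final Map<String, Integer> defined;    // symbol -> offset inside this module
        final List<String> referenced;         // symbols expected from other modules

        ObjectModule(String name, int size,
                     Map<String, Integer> defined, List<String> referenced) {
            this.name = name;
            this.size = size;
            this.defined = defined;
            this.referenced = referenced;
        }
    }

    /** Returns the final address of every symbol, or fails if one is unresolved. */
    static Map<String, Integer> link(List<ObjectModule> modules) {
        Map<String, Integer> symbols = new LinkedHashMap<>();
        int base = 0;
        // Pass 1: give each module a base address and record its symbols.
        for (ObjectModule module : modules) {
            for (Map.Entry<String, Integer> entry : module.defined.entrySet()) {
                if (symbols.put(entry.getKey(), base + entry.getValue()) != null) {
                    throw new IllegalStateException("Duplicate symbol: " + entry.getKey());
                }
            }
            base += module.size;
        }
        // Pass 2: every inter-module reference must now have a definition.
        for (ObjectModule module : modules) {
            for (String reference : module.referenced) {
                if (!symbols.containsKey(reference)) {
                    throw new IllegalStateException(
                            "Undefined symbol " + reference + " referenced from " + module.name);
                }
            }
        }
        return symbols;
    }

    public static void main(String[] args) {
        ObjectModule mainObj =
                new ObjectModule("main.o", 64, Map.of("main", 0), List.of("print"));
        ObjectModule ioObj =
                new ObjectModule("io.o", 32, Map.of("print", 8), List.of());
        System.out.println(link(List.of(mainObj, ioObj))); // prints: {main=0, print=72}
    }
}
```

Linking main.o on its own would fail with an undefined symbol error, which mirrors the point that an object file is not executable until its dependencies on other object files are resolved.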