What is the difference between bytecode files like Java class files and machine code executables like ELF?

Taylor Ramirez · Aug 30, 2012 · Viewed 13k times

What are the differences between byte code binary executables, such as Java class files, Parrot bytecode files or CLR files, and machine code executables such as ELF, Mach-O and PE?

What are the distinctive differences between the two?

For example, which part of a class file is equivalent to the .text area in the ELF structure?

Or: they all have headers, but the ELF and PE headers contain the architecture while the class file header does not.

[Images: Java class file, ELF file, PE file]

Answer

Technologeeks · Aug 31, 2012

Byte code is, as imulsion noted, an intermediate step, right before compilation into machine code. Because that last step is left to load time (and often to runtime, as is the case with Just-In-Time (JIT) compilation), byte code is architecture independent: the runtime (CLR for .NET or JVM for Java) is responsible for mapping the byte code opcodes to their underlying machine code representation.
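To make "the runtime maps opcodes" a bit more concrete, here is a minimal sketch of a stack-based interpreter loop in Python. The opcode values are the real JVM ones for these six instructions, but everything else (the `run` function, the flat byte string standing in for a method body) is a toy assumption and not how an actual JVM is built; a real runtime also verifies the code, resolves the constant pool, and typically JIT-compiles hot paths.

```python
# Real JVM opcode values; the interpreter around them is a toy for illustration.
ICONST_1, ICONST_2, BIPUSH = 0x04, 0x05, 0x10
IADD, IMUL, IRETURN = 0x60, 0x68, 0xAC


def run(code: bytes) -> int:
    """Interpret a tiny subset of JVM byte code on an operand stack.

    Simplification: Python ints are unbounded, so the 32-bit wraparound
    of real JVM int arithmetic is not modelled here.
    """
    stack = []
    pc = 0  # index of the next byte to decode
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == ICONST_1:
            stack.append(1)
        elif op == ICONST_2:
            stack.append(2)
        elif op == BIPUSH:
            # bipush carries one signed byte as an inline operand
            value = code[pc]
            pc += 1
            stack.append(value - 256 if value > 127 else value)
        elif op == IADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == IMUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == IRETURN:
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode {op:#04x}")
    raise ValueError("code ended without ireturn")


# bipush 40; iconst_2; iadd; ireturn
program = bytes([BIPUSH, 40, ICONST_2, IADD, IRETURN])
print(run(program))  # 42
```

The same `program` bytes give the same result on any machine that can run the interpreter, which is the sense in which byte code is architecture independent: only the interpreter (or JIT) has to know about the CPU underneath.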

By comparison, native code (Windows: PE, PE32+; OS X/iOS: Mach-O; Linux/Android/etc.: ELF) is compiled code, suited for a particular architecture (Android/iOS: ARM; most else: Intel 32-bit (i386) or 64-bit). These formats are all very similar, but they still require sections (or, in Mach-O parlance, "Load Commands") to set up the memory structure of the executable as it becomes a process (old DOS supported the ".com" format, which was a raw memory image). For all of the above you can say, roughly, the following:

  • Sections whose names start with a "." are created by the compiler, and are "default" sections, expected to have default behavior.
    • The executable has the main code section, usually called "text" or ".text". This is native code, which can run only on the specific architecture.
    • Strings are stored in a separate section. These are used for hard-coded output (what you print out) as well as symbol names.
    • Symbols, which are what the linker uses to put the executable together with its libraries (Windows: DLLs; Linux/Android: Shared Objects; OS X/iOS: .dylibs or frameworks), are stored in a separate section. Usually there is also a "PLT" (Procedure Linkage Table), which enables the compiler to simply put in stubs for the functions you call (printf, open, etc.) that the linker can connect when the executable loads.
    • Import table (in Windows parlance; in ELF this is a DYNAMIC section, in OS X this is an LC_LOAD_DYLIB command) is used to declare additional libraries. If those aren't found when the executable is loaded, the load fails, and you can't run it.
    • Export table (for libraries/dylibs/etc.) holds the symbols which the library (or, in Windows, even an .exe) can export so that others can link with it.
    • Constants are usually in what you see as ".rodata".
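On the header point from the question: the native formats do record the target architecture near the start of the file, while a class file header only carries a magic number and a format version. The Python sketch below reads those fields. The `describe` function and the lookup tables are my own illustrative names; the tables list only a few well-known machine values, and Mach-O is left out to keep it short.

```python
import struct

# A few well-known ELF e_machine values (subset, for illustration).
ELF_MACHINES = {3: "x86 (i386)", 40: "ARM", 62: "x86-64", 183: "AArch64"}
# A few well-known PE/COFF Machine values (subset, for illustration).
PE_MACHINES = {0x014C: "x86 (i386)", 0x01C0: "ARM", 0x8664: "x86-64", 0xAA64: "ARM64"}


def describe(data: bytes) -> str:
    """Guess the file format from its leading bytes and report the architecture."""
    if data[:4] == b"\x7fELF":
        # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian
        endian = "<" if data[5] == 1 else ">"
        # e_machine is the 16-bit field at offset 18 of the ELF header
        (machine,) = struct.unpack_from(endian + "H", data, 18)
        return "ELF, " + ELF_MACHINES.get(machine, f"machine {machine}")
    if data[:2] == b"MZ":
        # Offset 0x3C of the DOS header points at the "PE\0\0" signature
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\0\0":
            # The COFF Machine field follows the signature directly
            (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
            return "PE, " + PE_MACHINES.get(machine, f"machine {machine:#06x}")
        return "MZ (DOS) executable"
    if data[:4] == b"\xca\xfe\xba\xbe":
        # Caveat: Mach-O universal ("fat") binaries share this magic number;
        # this sketch does not try to tell the two apart.
        minor, major = struct.unpack_from(">HH", data, 4)
        return f"Java class file, version {major}.{minor} (no architecture field)"
    return "unknown"
```

Usage would be something like `describe(open(path, "rb").read())` for whatever `path` you have at hand. For a class file there is simply no field to look up: the architecture is decided by whichever JVM ends up running it.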

Hope this helps. Really, your question was vague.

TG