what's in a .exe file?

Gordon Gustafson picture Gordon Gustafson · Sep 30, 2009 · Viewed 19k times · Source

So a .exe file is a file that can be executed by windows, but what exactly does it contain? Assembly language that's processor specific? Or some sort of intermediate statement that's recognized by windows which turns it into assembly for a specific processor? What exactly does windows do with the file when it "executes" it?

Answer

Michael picture Michael · Sep 30, 2009

MSDN has an article "An In-Depth Look into the Win32 Portable Executable File Format" that describes the structure of an executable file.

Basically, a .exe contains several blobs of data and instructions on how they should be loaded into memory. Some of these sections happen to contain machine code that can be executed (other sections contain program data, resources, relocation information, import information, etc.)

I suggest you get a copy of Windows Internals for a full description of what happens when you run an exe.

For a native executable, the machine code is platform specific. The .exe's header indicates what platform the .exe is for.

When running a native .exe the following happens (grossly simplified):

  • A process object is created.
  • The exe file is read into that process's memory. Different sections of the .exe (code, data, etc.) are mapped in separately and given different permissions (code is execute, data is read/write, constants are read-only).
  • Relocations occur in the .exe (addresses get patched if the .exe was not loaded at its preferred address.)
  • The import table is walked and dependent DLL's are loaded.
  • DLL's are mapped in a similar method to .exe's, with relocations occuring and their dependent DLL's being loaded. Imported functions from DLL's are resolved.
  • The process starts execution at an initial stub in NTDLL.
  • The initial loader stub runs the entry points for each DLL, and then jumps to the entry point of the .exe.

Managed executables contain MSIL (Microsoft Intermediate Language) and may be compiled so they can target any CPU that the CLR supports. I am not that familiar with the inner workings of the CLR loader (what native code initially runs to boot strap the CLR and start interpreting the MSIL) - perhaps someone else can elaborate on that.