What does compiling WITH_PIC (-DWITH_PIC, --with-pic) actually do?

Jeff · Aug 2, 2013 · Viewed 9.5k times

When compiling binaries from source, what are the real-world differences between generating PIC objects or not? At what point down the road would someone say, "I should have generated/used PIC objects when I compiled MySQL." Or not?

I've read Gentoo's Introduction to Position Independent Code, Position Independent Code internals, HOWTO fix -fPIC errors, Libtool's Creating object files, and Position Independent Code.

From PHP's ./configure --help:

--with-pic: Try to use only PIC/non-PIC objects [default=use both].

From MySQL's cmake -LAH .:

-DWITH_PIC: Generate PIC objects

That info is a good start, but leaves me with a lot of questions.

From what I understand, it turns on -fPIC in the compiler, which in turn generates PIC objects in the resulting binaries/libraries. Why would I want to do that? Or vice-versa. Maybe it's riskier or could potentially make the binary less stable? Maybe it should be avoided when compiling on certain architectures (amd64/x86_64 in my case)?

The default MySQL build sets PIC=OFF. The official MySQL release build sets PIC=ON. And PHP "tries to use both." In my tests setting -DWITH_PIC=ON results in slightly larger binaries:

          PIC=OFF     PIC=ON
mysql     776,160    778,528
mysqld  7,339,704  7,476,024

Answer

stefan.schwetschke · Aug 12, 2013

There are two concepts one should not confuse:

  1. Relocatable binaries
  2. Position independent code

They both deal with similar problems, but on a different level.

The problem

Most processor architectures have two kinds of addressing: absolute and relative. Addresses are used for two kinds of access: accessing data (read, write, etc.) and executing a different part of the code (jump, call, etc.). Both can be done absolutely (call the code located at a fixed address, read data at a fixed address) or relatively (jump five instructions back, read relative to a pointer).

Relative addressing usually costs both speed and memory. Speed, because the processor must calculate the absolute address from the pointer and the relative value before it can access the real memory location or the real instruction. Memory, because an additional pointer must be stored (usually in a register, which is very fast but also very scarce memory).

Absolute addressing is not always feasible, because when implemented naively, one must know all addresses at compile time. In many cases this is impossible. When calling code from an external library, one might not know at which memory location the operating system will load the library. When addressing data on the heap, one will not know in advance which heap block the operating system will reserve for this operation.

Then there are many technical details. E.g. a processor architecture may only allow relative jumps up to a certain distance; all wider jumps must then be absolute. Or, on architectures with a very wide address range (e.g. 64 bit or even 128 bit), relative addressing leads to more compact code (because one can use 16 bits or 8 bits for relative offsets, while absolute addresses must always be 64 bits or 128 bits wide).

Relocatable binaries

When programs use absolute addresses, they make very strong assumptions about the layout of the address space. The operating system might not be able to fulfill all these assumptions. To ease this problem, most operating systems use a trick: the binaries are enriched with additional metadata. The operating system then uses this metadata to alter the binary at runtime, so the modified assumptions fit the current situation. Usually the metadata describes the positions of instructions in the binary that use absolute addressing. When the operating system loads the binary, it changes the absolute addresses stored in these instructions where necessary.

An example for these metadata are the "Relocation Tables" in the ELF file format.

Some operating systems use a further trick so they do not have to process every file each time before running it: they preprocess the files and change the data, so the stored assumptions will very likely fit the situation at runtime (and hence no modification is needed). This process is called "prebinding" on Mac OS X and "prelink" on Linux.

Relocatable binaries are produced at linker level.

Position independent code (PIC)

The compiler can produce code that uses only relative addressing. This could mean relative addressing for data and code, or only for one of these categories. The option -fPIC on gcc, for example, enforces relative addressing for code (i.e. only relative jumps and calls). The code can then run at any memory address without modification. On some processor architectures such code will not always be possible, e.g. when relative jumps are limited in their scope (say, relative jumps of at most 128 instructions are allowed).

Position independent code is handled on the compiler level. Executables containing only PIC code need no relocation information.

When is PIC code needed

In some special cases one absolutely needs PIC code, because relocation during loading is not feasible. Some examples:

  1. Some embedded systems can run binaries directly from the file system, without first loading them into memory. This is usually the case when the file system is already in memory, e.g. in ROM or flash memory. The executables then start much faster and need no extra part of the (usually scarce) RAM. This feature is called "execute in place".
  2. You are using some special plugin system. An extreme case would be so called "shell code", i.e. code injected using a security hole. You will then usually not know where your code will be located at runtime and the executable in question will not provide a relocation service for your code.
  3. The operating system does not support relocatable binaries (usually due to scarce resources, e.g. on an embedded platform).
  4. The operating system can cache common memory pages between running programs. When binaries are changed during relocation, this caching will no longer work (because each binary has its own version of the relocated code).
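The last point can be checked on a Linux system: a shared library built as PIC carries no TEXTREL entry in its dynamic section, so its code pages stay shareable between processes. A minimal sketch, assuming gcc and readelf (the library name is made up):

```shell
cat > lib.c <<'EOF'
int answer(void) { return 42; }
EOF

gcc -shared -fPIC lib.c -o libanswer.so

# TEXTREL would mark code pages that must be patched at load time
# (and therefore cannot be shared between processes).
readelf -d libanswer.so | grep TEXTREL || echo "no text relocations"
```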

When PIC should be avoided

  1. In some cases it might be impossible for the compiler to make everything position independent (e.g. because the compiler is not "clever" enough, or because the processor architecture is too restricted).
  2. The position independent code might be too slow or too big because of the many pointer operations.
  3. The optimizer might have problems with the many pointer operations, so it will not apply necessary optimizations and the executable will run like molasses.

Advice / Conclusion

PIC code might be needed because of some special constraints. In all other cases, stick with the defaults. If you do not know of any such constraints, you don't need -fPIC.