What is the reason function names are prefixed with an underscore by the compiler?

Question 1

c function assembly naming compilation

user142019 · May 6, 2011 · Viewed 14.9k times · Source

Answer

At the time that UNIX was rewritten in C in about 1974, its authors already had extensive assembler language libraries, and it was easier to mangle the names of new C and C-compatible code than to go back and fix all the existing code. Now, 20 years later, the assembler code has all been rewritten five times, and UNIX C compilers, particularly ones that create COFF and ELF object files, no longer prepend the underscore.

Prepending an underscore to symbol names in the assembly output of a C compiler is just a name-mangling convention that arose as a workaround. It stuck around for (as far as I know) no particular reason, and Clang now follows it on platforms that still use it, such as macOS.

Outside of assembly, C standard library implementations often prefix their internal, implementation-defined functions with an underscore to signal "magic, don't touch this" to the ordinary programmers who stumble across them.

Question 2

When I see the assembly code of a C app, like this:

emacs hello.c
clang -S -O hello.c -o hello.s
cat hello.s

Function names are prefixed with an underscore (e.g. callq _printf). Why is this done and what advantages does it have?

Example:

hello.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


int main() {
  char *myString = malloc(strlen("Hello, World!") + 1);
  memcpy(myString, "Hello, World!", strlen("Hello, World!") + 1);
  printf("%s", myString);
  return 0;
}

hello.s

_main:                       ; Here
Leh_func_begin0:
    pushq   %rbp
Ltmp0:
    movq    %rsp, %rbp
Ltmp1:
    movl    $14, %edi
    callq   _malloc          ; Here
    movabsq $6278066737626506568, %rcx
    movq    %rcx, (%rax)
    movw    $33, 12(%rax)
    movl    $1684828783, 8(%rax)
    leaq    L_.str1(%rip), %rdi
    movq    %rax, %rsi
    xorb    %al, %al
    callq   _printf          ; Here
    xorl    %eax, %eax
    popq    %rbp
    ret
Leh_func_end0:
