When I see the assembly code of a C app, like this:
emacs hello.c
clang -S -O hello.c -o hello.s
cat hello.s
Function names are prefixed with an underscore (e.g. callq _printf). Why is this done and what advantages does it have?
Example:
hello.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
    char *myString = malloc(strlen("Hello, World!") + 1);
    memcpy(myString, "Hello, World!", strlen("Hello, World!") + 1);
    printf("%s", myString);
    return 0;
}
hello.s
_main: ; Here
Leh_func_begin0:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
movl $14, %edi
callq _malloc ; Here
movabsq $6278066737626506568, %rcx
movq %rcx, (%rax)
movw $33, 12(%rax)
movl $1684828783, 8(%rax)
leaq L_.str1(%rip), %rdi
movq %rax, %rsi
xorb %al, %al
callq _printf ; Here
xorl %eax, %eax
popq %rbp
ret
Leh_func_end0:
From Linkers and Loaders:
At the time that UNIX was rewritten in C in about 1974, its authors already had extensive assembler language libraries, and it was easier to mangle the names of new C and C-compatible code than to go back and fix all the existing code. Now, 20 years later, the assembler code has all been rewritten five times, and UNIX C compilers, particularly ones that create COFF and ELF object files, no longer prepend the underscore.
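Prepending an underscore to C symbol names in the compiler's assembly output is just a name-mangling convention that arose as a workaround. It stuck around for (as far as I know) no particular reason, and Clang still follows it on platforms whose object-file convention calls for it, such as Mach-O on OS X, which is what your output shows. It has no advantage of its own today; it only keeps new code link-compatible with everything else on the platform.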
Outside of assembly, C standard library implementations often prefix their internal, implementation-specific functions with an underscore to signal "magic, don't touch this" to the ordinary programmers who stumble across them.