I'm trying to get a deeper understanding of the linking process and linker scripts. Looking at the binutils documentation, I found a simple linker script, which I've extended with a few commands:
OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
OUTPUT_ARCH(i386)
ENTRY(mymain)
SECTIONS
{
. = 0x10000;
.text : { *(.text) }
. = 0x8000000;
.data : { *(.data) }
.bss : { *(.bss) }
}
My program is very simple:
void mymain(void)
{
int a;
a++;
}
Then I tried to build an executable:
gcc -c main.c
ld -o prog -T my_script.lds main.o
But when I try to run prog, it receives a SIGKILL during startup. I know that when a program is compiled and linked with the command:
gcc prog.c -o prog
the final executable also includes other object files such as crt1.o, crti.o and crtn.o. But what about my case? What is the correct way to use this linker script?
I suspect that your code is running just fine, and getting into trouble at the end: what do you expect to happen after the a++?
mymain() is just an ordinary C function, which will try to return to its caller.
But you've set it as the ELF entry point, which tells the ELF loader to jump to it once it has loaded the program segments in the right place - and it doesn't expect you to return.
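To see what ENTRY(mymain) actually changes, it can help to read the entry address back out of the linked file. The following is a minimal sketch, assuming a Linux host whose <elf.h> provides Elf32_Ehdr and a host byte order that matches the file (little-endian for i386); elf32_entry is a hypothetical helper name, not part of any library. It copies the ELF32 header out of a buffer and reports e_entry, the address the loader jumps to:

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: extract e_entry from the start of an ELF32 image.
 * Returns 0 on success, -1 if the buffer is too short or not ELF32.
 * Assumes the host byte order matches the file's byte order. */
int elf32_entry(const unsigned char *buf, size_t len, uint32_t *entry)
{
    Elf32_Ehdr hdr;

    if (len < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);          /* avoid unaligned access */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                          /* missing "\177ELF" magic */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;                          /* not a 32-bit ELF file */
    *entry = hdr.e_entry;
    return 0;
}
```

Fed the first bytes of prog, this should report 0x10000 with the script in the question, since mymain sits at the start of .text; readelf -h prog prints the same field as "Entry point address".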
Those "other object files like crt1.o, crti.o and crtn.o" normally handle this stuff for C programs. The ELF entry point for a C program isn't main() - instead, it's a wrapper which sets up an appropriate environment for main() (e.g. setting up the argc and argv arguments on the stack or in registers, depending on platform), calls main() (with the expectation that it may return), and then invokes the exit system call (with the return code from main()).
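As a rough illustration of the "sets up argc and argv" step: on i386 Linux the kernel leaves argc at the top of the stack at the entry point, followed by the argv pointers, a NULL, the environment pointers, and another NULL. The sketch below models that layout as an array of pointer-sized slots and recovers the three values the way a startup wrapper might. struct start_args and parse_initial_stack are hypothetical names, and real startup code typically does this in assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for what a startup wrapper hands to main(). */
struct start_args {
    int    argc;
    char **argv;
    char **envp;
};

/* sp points at the argc word, i.e. where the stack pointer is at process
 * entry. The stack is modelled as an array of pointer-sized slots. */
struct start_args parse_initial_stack(char **sp)
{
    struct start_args a;

    a.argc = (int)(uintptr_t)sp[0];  /* first word: argument count */
    a.argv = sp + 1;                 /* then argc pointers plus a NULL */
    a.envp = a.argv + a.argc + 1;    /* environment follows argv's NULL */
    return a;
}
```

A wrapper would then call main(a.argc, a.argv) and pass its return value to the exit system call.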
[Update following comments:]
When I try your example with gdb, I see that it does indeed fail on returning from mymain(): after setting a breakpoint on mymain, and then stepping through instructions, I see that it performs the increment, then gets into trouble in the function epilogue:
$ gcc -g -c main.c
$ ld -o prog -T my_script.lds main.o
$ gdb ./prog
...
(gdb) b mymain
Breakpoint 1 at 0x10006: file main.c, line 4.
(gdb) r
Starting program: /tmp/prog
Breakpoint 1, mymain () at main.c:4
4 a++;
(gdb) display/i $pc
1: x/i $pc
0x10006 <mymain+6>: addl $0x1,-0x4(%ebp)
(gdb) si
5 }
1: x/i $pc
0x1000a <mymain+10>: leave
(gdb) si
Cannot access memory at address 0x4
(gdb) si
0x00000001 in ?? ()
1: x/i $pc
Disabling display 1 to avoid infinite recursion.
0x1: Cannot access memory at address 0x1
(gdb) q
The jump to 0x1 happens because the final ret pops the word at the top of the initial stack, which on i386 Linux is argc (1 here), and uses it as a return address. For i386 at least, the ELF loader sets up a sensible stack before entering the loaded code, so you can set the ELF entry point to a C function and get reasonable behaviour; however, as I mentioned above, you have to handle a clean process exit yourself. And if you're not using the C runtime, you'd better not be using any libraries that depend on the C runtime either.
So here is an example of that, using your original linker script - but with the C code modified to initialise a to a known value, and invoke an exit system call (using inline assembly) with the final value of a as the exit code. (Note: I've just realised that you haven't said exactly what platform you're using; I'm assuming Linux here.)
$ cat main2.c
void mymain(void)
{
int a = 42;
a++;
/* exit is syscall 1 on i386 Linux; both registers written here are listed as clobbered */
asm volatile("mov $1,%%eax; mov %0,%%ebx; int $0x80" : : "r"(a) : "%eax", "%ebx" );
}
$ gcc -c main2.c
$ ld -o prog2 -T my_script.lds main2.o
$ ./prog2 ; echo $?
43
$
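For comparison, the same raw exit call can be made without i386-specific assembly through the syscall() function that glibc declares in <unistd.h>. This is only a sketch for experimenting on a hosted Linux system: it links against libc, so it does not replace the freestanding example above, and exit_status_via_raw_syscall is a hypothetical helper name. It forks a child that terminates with the raw system call and returns the status the parent observes:

```c
#define _GNU_SOURCE             /* makes glibc declare syscall() */
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the raw exit system call in a child process and
 * return the exit status seen by the parent, or -1 on failure. */
int exit_status_via_raw_syscall(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        syscall(SYS_exit, code);   /* no atexit handlers, no stdio flush */
        for (;;) { }               /* not reached */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling exit_status_via_raw_syscall(43) should return 43, matching the echo $? output above.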