Buffer overflow in C

ryyst picture ryyst · Jun 2, 2011 · Viewed 10.8k times · Source

I'm attempting to write a simple buffer overflow using C on Mac OS X 10.6 64-bit. Here's the concept:

void function() {
    char buffer[64];
    buffer[offset] += 7;    // i'm not sure how large offset needs to be, or if
                            // 7 is correct.
}

int main() {

    int x = 0;
    function();
    x += 1;
    printf("%d\n", x);      // the idea is to modify the return address so that
                            // the x += 1 expression is not executed and 0 gets
                            // printed

    return 0;
}

Here's part of main's assembler dump:

...
0x0000000100000ebe <main+30>:   callq  0x100000e30 <function>
0x0000000100000ec3 <main+35>:   movl   $0x1,-0x8(%rbp)
0x0000000100000eca <main+42>:   mov    -0x8(%rbp),%esi
0x0000000100000ecd <main+45>:   xor    %al,%al
0x0000000100000ecf <main+47>:   lea    0x56(%rip),%rdi        # 0x100000f2c
0x0000000100000ed6 <main+54>:   callq  0x100000ef4 <dyld_stub_printf>
...

I want to jump over the movl instruction, which would mean I'd need to increment the return address by 42 - 35 = 7 (correct?). Now I need to know where the return address is stored so I can calculate the correct offset.

I have tried searching for the correct value manually, but either 1 gets printed or I get abort trap – is there maybe some kind of buffer overflow protection going on?


Using an offset of 88 works on my machine. I used Nemo's approach of finding out the return address.

Answer

Ben Jackson picture Ben Jackson · Jun 2, 2011

This 32-bit example illustrates how you can figure it out, see below for 64-bit:

#include <stdio.h>

void function() {
    char buffer[64];
    char *p;
    asm("lea 4(%%ebp),%0" : "=r" (p));  // loads address of return address
    printf("%d\n", p - buffer);         // computes offset
    buffer[p - buffer] += 9;            // 9 from disassembling main
}

int main() {
    volatile int x = 7;
    function();
    x++;
    printf("x = %d\n", x); // prints 7, not 8
}

On my system the offset is 76. That's the 64 bytes of the buffer (remember, the stack grows down, so the start of the buffer is far from the return address) plus whatever other detritus is in between.

Obviously if you are attacking an existing program you can't expect it to compute the answer for you, but I think this illustrates the principle.

(Also, we are lucky that +9 does not carry out into another byte. Otherwise the single byte increment would not set the return address how we expected. This example may break if you get unlucky with the return address within main)

I overlooked the 64-bitness of the original question somehow. The equivalent for x86-64 is 8(%rbp) because pointers are 8 bytes long. In that case my test build happens to produce an offset of 104. In the code above substitute 8(%%rbp) using the double %% to get a single % in the output assembly. This is described in this ABI document. Search for 8(%rbp).

There is a complaint in the comments that 4(%ebp) is just as magic as 76 or any other arbitrary number. In fact the meaning of the register %ebp (also called the "frame pointer") and its relationship to the location of the return address on the stack is standardized. One illustration I quickly Googled is here. That article uses the terminology "base pointer". If you wanted to exploit buffer overflows on other architectures it would require similarly detailed knowledge of the calling conventions of that CPU.