following this instructions I have managed to produce only 528 bytes in size a.out (when gcc main.c gave me 8539 bytes big file initially).
main.c was:
int main(int argc, char** argv) {
return 42;
}
but I have built a.out from this assembly file instead:
main.s:
; tiny.asm
BITS 64
GLOBAL _start
SECTION .text
_start:
mov eax, 1
mov ebx, 42
int 0x80
with:
me@comp# nasm -f elf64 tiny.s
me@comp# gcc -Wall -s -nostartfiles -nostdlib tiny.o
me@comp# ./a.out ; echo $?
42
me@comp# wc -c a.out
528 a.out
because I need machine code I do:
objdump -d a.out
a.out: file format elf64-x86-64
Disassembly of section .text:
00000000004000e0 <.text>:
4000e0: b8 01 00 00 00 mov $0x1,%eax
4000e5: bb 2a 00 00 00 mov $0x2a,%ebx
4000ea: cd 80 int $0x80
># objdump -hrt a.out
a.out: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .note.gnu.build-id 00000024 00000000004000b0 00000000004000b0 000000b0 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .text 0000000c 00000000004000e0 00000000004000e0 000000e0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
SYMBOL TABLE:
no symbols
file is in little endian convention:
me@comp# readelf -a a.out
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x4000e0
Start of program headers: 64 (bytes into file)
Start of section headers: 272 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 2
Size of section headers: 64 (bytes)
Number of section headers: 4
Section header string table index: 3
now I want to execute this like this:
#include <unistd.h>
// which version is (more) correct?
// this might be related to endiannes (???)
char code[] = "\x01\xb8\x00\x00\xbb\x00\x00\x2a\x00\x00\x80\xcd\x00";
char code_v1[] = "\xb8\x01\x00\x00\x00\xbb\x2a\x00\x00\x00\xcd\x80\x00";
int main(int argc, char **argv)
{
/*creating a function pointer*/
int (*func)();
func = (int (*)()) code;
(int)(*func)();
return 0;
}
however I get segmentation fault. My question is: is this section of text
4000e0: b8 01 00 00 00 mov $0x1,%eax
4000e5: bb 2a 00 00 00 mov $0x2a,%ebx
4000ea: cd 80 int $0x80
(this machine code) all I really need? What I do wrong (endiannes??), maybe I just need to call this in different way since SIGSEGV?
The code must be in a page with execute permission. By default, stack and read-write static data (like non-const globals) are in pages mapped without exec permission, for security reasons.
The simplest way is to compile with gcc -z execstack
, which links your program such that stack and global variables (static storage) get mapped in executable pages, and so do allocations with malloc
.
Another way to do it without making everything executable is to copy this binary machine code into an executable buffer.
#include <unistd.h>
#include <sys/mman.h>
#include <string.h>
char code[] = {0x55,0x48,0x89,0xe5,0x89,0x7d,0xfc,0x48,
0x89,0x75,0xf0,0xb8,0x2a,0x00,0x00,0x00,0xc9,0xc3,0x00};
/*
00000000004004b4 <main> 55 push %rbp
00000000004004b5 <main+0x1> 48 89 e5 mov %rsp,%rbp
00000000004004b8 <main+0x4> 89 7d fc mov %edi,-0x4(%rbp)
00000000004004bb <main+0x7> 48 89 75 f0 mov %rsi,-0x10(%rbp)
'return 42;'
00000000004004bf <main+0xb> b8 2a 00 00 00 mov $0x2a,%eax
'}'
00000000004004c4 <main+0x10> c9 leaveq
00000000004004c5 <main+0x11> c3 retq
*/
int main(int argc, char **argv) {
void *buf;
/* copy code to executable buffer */
buf = mmap (0,sizeof(code),PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANON,-1,0);
memcpy (buf, code, sizeof(code));
__builtin___clear_cache(buf, buf+sizeof(code)-1); // on x86 this just stops memcpy from optimizing away as a dead store
/* run code */
int i = ((int (*) (void))buf)();
printf("get this done. returned: %d", i);
return 0;
}
output:
get this done. returned: 42
RUN SUCCESSFUL (total time: 57ms)
Without __builtin___clear_cache
, this could break with optimization enabled because gcc would think the memcpy
was a dead store and optimize it away. When compiling for x86, __builtin___clear_cache
does not actually clear any cache; there are zero extra instructions; it just marks the memory as "used" so stores to it aren't considered "dead". (See the gcc manual.)
Another option would be to mprotect
the page containing the char code[]
array, giving it PROT_READ|PROT_WRITE|PROT_EXEC
. This works whether it's a local array (on the stack) or global in the .data
.
Or if it's const char code[]
in the .rodata
section, you might just give it PROT_READ|PROT_EXEC
.
(In versions of binutils ld
from before about 2019, the .rodata
got linked as part of the same segment as .text
, and was already mapped executable. But recent ld
gives it a separate segment so it can be mapped without exec permission so const char code[]
doesn't give you an executable array anymore, but it used to so you may this old advice in other places.)