Advice for learning Linux x86-64 assembly & documentation

grabber · Oct 16, 2009

Does anyone have documentation for learning the fundamentals of Linux x86-64 assembly? I'm not sure whether to learn x86-64 directly or to learn 32-bit x86 first and move on to x86-64 later. Since I have an x86-64 computer and not an x86 one, I was thinking of learning x86-64 ;)

Maybe someone could give me some incentive and some direction on what to learn, how to learn it, and which documentation to use.

Please share your favourite documentation titles. I code a little Python and this is my first attempt at a lower-level language, but I'm more than ready to dedicate myself to it.

Thanks all

Answer

Callum · Nov 29, 2009

General advice:

There is no single "x86 assembler". Each assembler has its own syntax and directives, and source written for one generally will not assemble with another. I recommend NASM because it is widely used, easy to install, and supports 64-bit assembly.

Read a good book on x86 assembly to get a feel for the basics (registers, conditional jumps, arithmetic, etc.). I read The Art of Assembly Language by Randall Hyde when I was starting out.

http://asm.sourceforge.net looks like it has some good tutorials that you might want to work through. But if you are assembling in 64-bit mode, beware that the calling conventions for C functions and for syscalls are different from their 32-bit counterparts.

You will need the CPU reference manuals. Personally, I prefer the AMD ones. You want volumes 1 and 3 of the CPU manual. The other volumes might be of interest as well.

64-bit specific advice:

64-bit x86 assembly is almost the same as 32-bit x86 assembly, since 64-bit x86 is mostly backwards compatible with 32-bit. You get access to the 64-bit registers and a few other features, some obscure instructions are no longer valid, and the rest is the same as 32-bit.
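As a rough illustration of how little changes, here is a small sketch in C with GCC-style inline assembly. That choice is an assumption made so the example compiles on its own: it needs GCC or Clang on x86-64, and inline assembly uses AT&T operand order (source first) rather than NASM syntax. The function names are made up for this example. The same add instruction works on 32-bit and 64-bit operands, and r10 is one of the registers that only exist in 64-bit mode.

```c
#include <stdint.h>

/* 32-bit add: the compiler substitutes 32-bit register names (eax, ecx, ...). */
uint32_t add32(uint32_t a, uint32_t b)
{
    __asm__ ("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
}

/* Same instruction, 64-bit operand size: register names become rax, rcx, ... */
uint64_t add64(uint64_t a, uint64_t b)
{
    __asm__ ("addq %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
}

/* r8 to r15 are only available in 64-bit mode; this routes a value through r10. */
uint64_t inc_via_r10(uint64_t x)
{
    uint64_t out;
    __asm__ ("movq %1, %%r10\n\t"   /* copy the input into r10 */
             "leaq 1(%%r10), %0"    /* out = r10 + 1 */
             : "=r"(out)
             : "r"(x)
             : "r10");              /* tell the compiler r10 is overwritten */
    return out;
}
```

The 32-bit version wraps around at 2^32, while the 64-bit version carries into the upper half of the register.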

However, the syscall convention is completely different on 64-bit Linux. Depending on your kernel, the 32-bit syscalls may or may not be available. What's worse is that the 64-bit syscall calling convention is poorly documented. I only figured it out by examining the depths of the glibc source code.

To save you the hassle of finding this out the hard way: the syscall numbers are in the Linux source code under arch/x86/include/asm/unistd_64.h. The syscall number is passed in the rax register. The parameters are passed, in order, in rdi, rsi, rdx, r10, r8 and r9. The call is invoked with the syscall instruction. The syscall instruction overwrites the rcx and r11 registers. The return value is in rax. (A brief example can be found here.)
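The convention above can be sketched with GCC-style inline assembly in C. This is only an illustration under stated assumptions: x86-64 Linux, GCC or Clang, and helper names (raw_syscall3, raw_getpid, demo_write) that are invented here and are not part of any library. The numbers 1 (write) and 39 (getpid) are the x86-64 Linux syscall numbers. On failure the kernel returns a negated errno value in rax instead of setting errno.

```c
/* Hypothetical helper: invoke a syscall that takes up to three arguments.
   Arguments four to six would go in r10, r8 and r9 (not covered here). */
long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                   /* return value comes back in rax */
        : "a"(nr),                    /* syscall number goes in rax */
          "D"(a1), "S"(a2), "d"(a3)   /* arguments in rdi, rsi, rdx */
        : "rcx", "r11", "memory"      /* syscall overwrites rcx and r11 */
    );
    return ret;
}

/* getpid takes no arguments; unused argument registers are simply ignored. */
long raw_getpid(void)
{
    return raw_syscall3(39 /* getpid */, 0, 0, 0);
}

/* write(1, msg, len): returns the number of bytes written, or -errno. */
long demo_write(void)
{
    static const char msg[] = "hello via syscall\n";
    return raw_syscall3(1 /* write */, 1 /* stdout */,
                        (long)msg, (long)(sizeof msg - 1));
}
```

Calling demo_write() from main should print the message and return its length. Passing an invalid file descriptor such as -1 to the write syscall should return -9, which is -EBADF.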