Find which assembly instruction caused an Illegal Instruction error without debugging

pythonic · Apr 27, 2012 · Viewed 69k times

While running a program I've written in assembly, I get an Illegal instruction error. Is there a way to find out which instruction is causing the error without debugging? The machine I'm running on has no debugger or any other development tools. In other words, I build on one machine and run on another. I cannot test the program on the build machine because it doesn't support SSE4.2. The machine I run the program on does support SSE4.2 instructions.

I think it may be because I need to tell the assembler (YASM) to recognize the SSE4.2 instructions, just as we do with gcc by passing it the -msse4.2 flag. Or do you think that's not the reason? Any idea how to tell YASM to recognize SSE4.2 instructions?

Maybe I should trap the SIGILL signal with a handler installed using SA_SIGINFO and then decode the siginfo_t it receives to see what kind of illegal operation the program performs.

Answer

Diego Pino · Oct 24, 2016

Recently I experienced a crash with exit status 132 (128 + 4: 128 means the program was terminated by a signal, and 4 is SIGILL, the illegal instruction signal). Here's how I figured out which instruction was causing the crash.
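The 128 + 4 arithmetic can be checked from the shell. A minimal sketch, assuming a POSIX shell on Linux; describe_status is a hypothetical helper name, not part of any tool:

```shell
# describe_status is a hypothetical helper, not a standard command.
# Shells report "killed by signal N" as exit status 128 + N.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l N prints the name of signal N without the SIG prefix
        printf 'killed by SIG%s\n' "$(kill -l "$((status - 128))")"
    else
        printf 'exited with status %s\n' "$status"
    fi
}

describe_status 132
```

On Linux, where SIGILL is signal 4, this should print "killed by SIGILL".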

First, I enabled core dumps:

$ ulimit -c unlimited

Interestingly, the directory I was running the binary from already contained a directory named core, so a core file with that name could not be written there. I had to tell Linux to append the PID to the core file name:

$ sudo sysctl -w kernel.core_uses_pid=1
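Where the core file ends up also depends on the kernel.core_pattern setting, which some distributions point at a crash handler instead of a plain file name. A rough sketch for checking it, assuming a Linux /proc filesystem; classify_core_pattern is a hypothetical helper name:

```shell
# classify_core_pattern is a hypothetical helper, not a standard command.
# It describes what a given kernel.core_pattern value means.
classify_core_pattern() {
    case "$1" in
        '|'*) printf 'piped to a handler: %s\n' "${1#'|'}" ;;
        /*)   printf 'absolute path template: %s\n' "$1" ;;
        *)    printf 'relative to the working directory of the crashing process: %s\n' "$1" ;;
    esac
}

# Inspect the current setting on this machine.
classify_core_pattern "$(cat /proc/sys/kernel/core_pattern 2>/dev/null)"
```

If the pattern turns out to be piped to a handler, the core will not appear in the current directory, and the handler's own tooling would be the place to look.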

Then I ran my program and got a core file named core.23650. I loaded the binary and the core with gdb:

$ gdb program core.23650

Once in gdb, it showed the following information:

Program terminated with signal SIGILL, Illegal instruction.
#0  0x00007f58e9efd019 in ?? ()

That means my program crashed due to an illegal instruction at memory address 0x00007f58e9efd019. Then I switched to the asm layout to check the instruction at that address:

(gdb) layout asm
>|0x7f58e9efd019  vpmaskmovd (%r8),%ymm15,%ymm0
 |0x7f58e9efd01e  vpmaskmovd %ymm0,%ymm15,(%rdi)
 |0x7f58e9efd023  add    $0x4,%rdi
 |0x7f58e9efd027  add    $0x0,%rdi
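If an interactive session is inconvenient, the same lookup can probably be done in one shot. A sketch, assuming gdb is installed on the machine that holds the core file; faulting_insn is a hypothetical wrapper name, and program and core.23650 are the files from the steps above:

```shell
# faulting_insn is a hypothetical wrapper, not a gdb command.
faulting_insn() {
    binary=$1
    core=$2
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: faulting_insn BINARY CORE" >&2
        return 2
    fi
    # -batch: run the given commands, then exit
    # x/i $pc: disassemble the single instruction at the saved program counter
    gdb -batch -ex 'x/i $pc' "$binary" "$core"
}

# Example call with the files from this answer:
# faulting_insn program core.23650
```

This prints the instruction at the crash address without entering the TUI, which makes it easier to paste into a bug report or run over several core files.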

The vpmaskmovd instruction caused the error. Apparently, I was trying to run a program built for AVX2 on a system that lacks support for the AVX2 instruction set.

$ cat /proc/cpuinfo | grep avx2

Lastly, I confirmed that vpmaskmovd is an AVX2-only instruction.
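The grep avx2 check above prints nothing when the flag is absent, which is easy to misread. A slightly more explicit sketch, assuming a Linux x86 /proc/cpuinfo and a grep that supports -w and -m (GNU, BSD and BusyBox grep do); cpu_has_flag is a hypothetical helper name:

```shell
# cpu_has_flag is a hypothetical helper, not a standard command.
# It succeeds if FLAG appears as a whole word on the first "flags" line.
cpu_has_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

if cpu_has_flag avx2; then
    echo "avx2: supported"
else
    echo "avx2: not supported (or not reported)"
fi
```

Matching whole words avoids false positives from flags that merely contain the search string. The same helper could be called with sse4_2, which is how /proc/cpuinfo spells the SSE4.2 flag from the question.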