What is the difference between the mov and movl instructions in x86? I'm also having trouble reading the assembly

Ding Xin · Jun 5, 2018 · Viewed 24.6k times

Recently I have been reading some books on computer science. I wrote some C code, compiled it with gcc, and disassembled the result with objdump.

The following C code:

#include <stdio.h>
#include <stdbool.h>

int dojob()
{
    static short num[ ][4] = { {2, 9, -1, 5},   {3, 8, 2, -6}};
    static short *pn[ ] = {num[0], num[1]};
    static short s[2] = {0, 0};
    int i, j;

    for (i=0; i<2; i++) {
        for (j=0; j<4; j++){
            s[i] += *pn[i]++;
        }
        printf ("sum of line %d: %d\n", i+1, s[i]);
    }

    return 0;
}

int main ( )
{
    dojob();
}

I got the following assembly code (AT&T syntax; only the disassembly of the function dojob and some of the data is listed):

00401350 <_dojob>:
  401350:   55                      push   %ebp
  401351:   89 e5                   mov    %esp,%ebp
  401353:   83 ec 28                sub    $0x28,%esp
  401356:   c7 45 f4 00 00 00 00    movl   $0x0,-0xc(%ebp)
  40135d:   eb 75                   jmp    4013d4 <_dojob+0x84>
  40135f:   c7 45 f0 00 00 00 00    movl   $0x0,-0x10(%ebp)
  401366:   eb 3c                   jmp    4013a4 <_dojob+0x54>
  401368:   8b 45 f4                mov    -0xc(%ebp),%eax
  40136b:   8b 04 85 00 20 40 00    mov    0x402000(,%eax,4),%eax
  401372:   8d 48 02                lea    0x2(%eax),%ecx
  401375:   8b 55 f4                mov    -0xc(%ebp),%edx
  401378:   89 0c 95 00 20 40 00    mov    %ecx,0x402000(,%edx,4)
  40137f:   0f b7 10                movzwl (%eax),%edx
  401382:   8b 45 f4                mov    -0xc(%ebp),%eax
  401385:   0f b7 84 00 08 50 40    movzwl 0x405008(%eax,%eax,1),%eax
  40138c:   00 
  40138d:   89 c1                   mov    %eax,%ecx
  40138f:   89 d0                   mov    %edx,%eax
  401391:   01 c8                   add    %ecx,%eax
  401393:   89 c2                   mov    %eax,%edx
  401395:   8b 45 f4                mov    -0xc(%ebp),%eax
  401398:   66 89 94 00 08 50 40    mov    %dx,0x405008(%eax,%eax,1)
  40139f:   00 
  4013a0:   83 45 f0 01             addl   $0x1,-0x10(%ebp)
  4013a4:   83 7d f0 03             cmpl   $0x3,-0x10(%ebp)
  4013a8:   7e be                   jle    401368 <_dojob+0x18>
  4013aa:   8b 45 f4                mov    -0xc(%ebp),%eax
  4013ad:   0f b7 84 00 08 50 40    movzwl 0x405008(%eax,%eax,1),%eax
  4013b4:   00 
  4013b5:   98                      cwtl   
  4013b6:   8b 55 f4                mov    -0xc(%ebp),%edx
  4013b9:   83 c2 01                add    $0x1,%edx
  4013bc:   89 44 24 08             mov    %eax,0x8(%esp)
  4013c0:   89 54 24 04             mov    %edx,0x4(%esp)
  4013c4:   c7 04 24 24 30 40 00    movl   $0x403024,(%esp)
  4013cb:   e8 50 08 00 00          call   401c20 <_printf>
  4013d0:   83 45 f4 01             addl   $0x1,-0xc(%ebp)
  4013d4:   83 7d f4 01             cmpl   $0x1,-0xc(%ebp)
  4013d8:   7e 85                   jle    40135f <_dojob+0xf>
  4013da:   b8 00 00 00 00          mov    $0x0,%eax
  4013df:   c9                      leave  
  4013e0:   c3                      ret    


Disassembly of section .data:

00402000 <__data_start__>:
  402000:   08 20                   or     %ah,(%eax)
  402002:   40                      inc    %eax
  402003:   00 10                   add    %dl,(%eax)
  402005:   20 40 00                and    %al,0x0(%eax)


Disassembly of section .bss:

...

00405008 <_s.1927>:
  405008:   00 00                   add    %al,(%eax)
    ...

I have two questions:

  1. What is the difference between the mov and movl instructions? Why does the compiler generate mov for some code and movl for other code?

  2. I understand the C code completely, but not the assembly the compiler generated. Could someone annotate it so that I can follow it? Thanks a lot.

Answer

Volonté du Peuple · Jun 5, 2018

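movl is the same instruction as mov; in AT&T syntax the l suffix says that the operands are 32 bits wide. It shows up here because i and j are int, which is 32 bits on this target, and they live on the stack: an instruction such as movl $0x0,-0xc(%ebp) has no register operand, so the suffix is the only thing that gives the size. When one operand is a 32-bit register, as in mov %esp,%ebp, the register already implies the size and objdump leaves the suffix off.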

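The other suffixes work the same way: movb moves 8 bits, movw 16 bits, and movq 64 bits (x86-64 only). They select an operand size and are not an optimization. MOVD and MOVQ also exist as separate MMX/SSE instructions that move a doubleword or quadword to or from a vector register.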
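As a rough illustration of how operand width relates to the suffix an AT&T-syntax disassembler prints, here is a small C sketch. The helper names att_suffix and check_suffixes are hypothetical and exist only for this example; the b/w/l/q letters themselves are the standard AT&T operand-size suffixes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the AT&T operand-size suffix for an integer
   operand of the given width in bytes, or '?' for any other width. */
static char att_suffix(size_t bytes)
{
    switch (bytes) {
    case 1: return 'b';   /* byte:              movb */
    case 2: return 'w';   /* word:              movw */
    case 4: return 'l';   /* long (doubleword): movl */
    case 8: return 'q';   /* quadword:          movq, x86-64 only */
    default: return '?';
    }
}

/* Self-check using fixed-width types, so the result does not depend
   on how wide short or int happen to be on a given target. */
static void check_suffixes(void)
{
    assert(att_suffix(sizeof(int8_t))  == 'b');
    assert(att_suffix(sizeof(int16_t)) == 'w');
    assert(att_suffix(sizeof(int32_t)) == 'l');
    assert(att_suffix(sizeof(int64_t)) == 'q');
}
```

In the listing from the question, i and j are 4-byte ints, which is why stores of an immediate to their stack slots are printed as movl, while the 16-bit store to s[i] needs no suffix because %dx already fixes the width.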

PS: objdump's -M intel option may make the disassembly easier to follow, since a lot of documentation on Intel syntax is easy to find.
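For the second question, one way to read the inner loop (addresses 401368 to 4013a8 in the listing) is to rewrite s[i] += *pn[i]++ as separate C statements in the order the instructions run. This is only a sketch: the function names inner_step and row_sum are made up for the example, and the reading of 0x402000 as pn is inferred from the .data bytes (two pointers, 0x402008 and 0x402010), while 0x405008 is labelled _s in the listing.

```c
#include <assert.h>
#include <stdint.h>

static short num[2][4] = {{2, 9, -1, 5}, {3, 8, 2, -6}};
static short *pn[2] = {num[0], num[1]};
static short s[2] = {0, 0};

/* One pass through the inner loop body for row i. */
static void inner_step(int i)
{
    assert(i >= 0 && i < 2);

    short *old = pn[i];              /* mov    0x402000(,%eax,4),%eax      */
    pn[i] = old + 1;                 /* lea    0x2(%eax),%ecx  (+2 bytes)  */
                                     /* mov    %ecx,0x402000(,%edx,4)      */
    uint32_t elem = (uint16_t)*old;  /* movzwl (%eax),%edx   zero-extend   */
    uint32_t acc  = (uint16_t)s[i];  /* movzwl 0x405008(%eax,%eax,1),%eax  */
    uint32_t sum  = acc + elem;      /* add    %ecx,%eax     32-bit add    */

    /* mov %dx,0x405008(%eax,%eax,1): only the low 16 bits are stored.
       The conversion back to short wraps on gcc; before C23 the
       standard leaves that case implementation-defined. */
    s[i] = (short)(uint16_t)sum;
}

/* The inner loop plus the value that is handed to printf. */
static int row_sum(int i)
{
    for (int j = 0; j <= 3; j++)     /* cmpl $0x3,-0x10(%ebp) ; jle        */
        inner_step(i);
    return (int)s[i];                /* movzwl ... ; cwtl    sign-extend   */
}
```

With the data from the question, row_sum(0) gives 15 and row_sum(1) gives 7. Each call advances pn[i] by four elements, so it should be called once per row. The zero-extended loads look odd for signed data, but since only the low 16 bits of the sum are stored, the result is the same as a signed 16-bit addition; the sign is restored by cwtl just before the call to printf.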