Basic use of immediates vs. square brackets in YASM/NASM x86 assembly

InvalidBrainException · Apr 28, 2012

Suppose I have the following declared:

section .bss
buffer    resb     1

And these instructions follow in section .text:

mov    al, 5                    ; mov-immediate
mov    [buffer], al             ; store
mov    bl, [buffer]             ; load
mov    cl, buffer               ; mov-immediate?

Am I correct in understanding that bl will contain the value 5, and cl will contain the memory address of the variable buffer?

I am confused about the differences between

  • moving an immediate into a register,
  • moving a register into an immediate (what goes in, the data or the address?) and
  • moving an immediate into a register without the brackets
    • For example, mov cl, buffer vs mov cl, [buffer]

UPDATE: After reading the responses, I suppose the following summary is accurate:

  • mov edi, array puts the memory address of the zeroth array index in edi. i.e. the label address.
  • mov byte [edi], 3 puts the VALUE 3 into the zeroth index of the array
  • after add edi, 3, edi now contains the memory address of the 3rd index of the array
  • mov al, [array] loads the DATA at the zeroth index into al.
  • mov al, [array+3] loads the DATA at the third index into al.
  • mov [al], [array] is invalid because x86 can't encode 2 explicit memory operands, and because al is only 8 bits and can't be used even in a 16-bit addressing mode. See "Referencing the contents of a memory location (x86 addressing modes)".
  • mov array, 3 is invalid, because you can't say "Hey, I don't like the offset at which array is stored, so I'll call it 3". An immediate can only be a source operand.
  • mov byte [array], 3 puts the value 3 into the zeroth index (first byte) of the array. The byte specifier is needed because, with a memory destination and an immediate source, nothing else tells the assembler whether to store a byte, word or dword. Without it, assembly fails with an ambiguous operand size error.

Please mention if any of these is false. (editor's note: I fixed syntax errors / ambiguities so the valid ones actually are valid NASM syntax. And linked other Q&As for details)
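
The valid cases in the summary above can be put together in one small program. This is a minimal sketch, not part of the original question: it assumes 32-bit Linux (the int 0x80 exit call), an 8-byte array whose size is chosen only for illustration, and a hypothetical file name demo.asm.

```nasm
; Sketch only: assumes 32-bit Linux and NASM/YASM syntax.
; Hypothetical build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .bss
array   resb 8                  ; 8 uninitialized bytes (size is illustrative)

section .text
global _start
_start:
    mov   edi, array            ; edi = address of array[0] (label as immediate)
    mov   byte [edi], 3         ; array[0] = 3; "byte" gives the operand size
    add   edi, 3                ; edi = address of array[3]
    mov   byte [edi], 4         ; array[3] = 4
    mov   al, [array]           ; al = 3; size is implied by the 8-bit register
    add   al, [array+3]         ; al = 3 + 4 = 7
    movzx ebx, al               ; exit status = 7
    mov   eax, 1                ; sys_exit on 32-bit Linux
    int   0x80
```

If built and run as assumed, ./demo followed by echo $? should print 7, which confirms that both stores went to the bytes the loads read back.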

Answer

Job picture Job · Apr 28, 2012

The square brackets essentially work like a dereference operator (e.g., like * in C).

So, something like

mov REG, x

moves the value of x into REG, whereas

mov REG, [x]

moves the value stored at the memory location that x points to into REG. Note that if x is a label, its value is the address of that label.
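
To make the distinction concrete, here is a minimal sketch that is not from the original answer. It assumes 32-bit Linux and a hypothetical dword variable x that holds 42. In C terms, the first mov corresponds roughly to p = &x and the second to v = *p.

```nasm
; Sketch only: assumes 32-bit Linux and NASM/YASM syntax.
; Hypothetical build: nasm -f elf32 deref.asm && ld -m elf_i386 -o deref deref.o
section .data
x       dd 42                   ; hypothetical dword variable holding 42

section .text
global _start
_start:
    mov   eax, x                ; eax = address of x (no memory access happens)
    mov   ebx, [eax]            ; ebx = 42, the dword stored at that address
                                ; mov ebx, [x] would load the same value directly
    mov   eax, 1                ; sys_exit on 32-bit Linux; status is in ebx
    int   0x80
```

Under these assumptions the exit status is 42, the contents of x, not its address.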

As for your question:

Am I correct in understanding that bl will contain the value 5, and cl will contain the memory address of the variable buffer?

Yes, you are correct. But beware that, since CL is only 8 bits wide, it will only contain the least significant byte of the address of buffer.
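
One way to see the truncation without depending on how a particular toolchain handles an 8-bit address is to load the full address first and then look at its low byte. This is a hedged sketch assuming 32-bit Linux. Depending on the assembler and output format, mov cl, buffer itself may produce a warning or a relocation error instead of truncating silently, so it is avoided here.

```nasm
; Sketch only: assumes 32-bit Linux and NASM/YASM syntax.
; Hypothetical build: nasm -f elf32 low.asm && ld -m elf_i386 -o low low.o
section .bss
buffer  resb 1

section .text
global _start
_start:
    mov   ecx, buffer           ; ecx = full 32-bit address of buffer
    movzx ebx, cl               ; ebx = only the least significant byte of it
    mov   eax, 1                ; sys_exit on 32-bit Linux; status is in ebx
    int   0x80
```

The exit status is whatever the low byte of buffer's address happens to be after linking, which shows that an 8-bit register cannot hold a usable pointer.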