Move quadword between xmm and general-purpose register in ml64?

0xbe5077ed · Jul 16, 2014 · Viewed 7.2k times

In a simple program written for Microsoft's x64 assembler, I want to move a 64-bit value between an SSE register (say xmm0) and a general-purpose register (say rcx), as in (Intel syntax in MASM):

mov xmm0, rcx
...
mov rcx, xmm0

These two lines generate the following error messages, respectively, from ml64.exe:

  • error A2152: coprocessor register cannot be first operand
  • error A2070: invalid instruction operands

However, it is clearly possible to accomplish this simple task in x64. For example, the following is a functioning x64 program that I can assemble and run in GAS (AT&T syntax using GCC 4.8.2):

.text
    .globl main
main:
    movl $1, %ecx
    movq %rcx, %xmm0
    movq %xmm0, %rax
    ret

As expected, the return value of this program is 1 and the objdump output for main() is:

1004010d0:   b9 01 00 00 00          mov    $0x1,%ecx
1004010d5:   66 48 0f 6e c1          movq   %rcx,%xmm0
1004010da:   66 48 0f 7e c0          movq   %xmm0,%rax
1004010df:   c3                      retq

So my question is, how can I accomplish this in MASM given that ml64.exe is producing the above errors?

Answer

PhiS · Aug 14, 2014

The MOV instruction cannot move data between a general-purpose register and an xmm register. The instruction you are looking for is MOVQ (as in the AT&T syntax code you show), as defined in Intel's instruction set manuals. (HTML extract here: https://www.felixcloutier.com/x86/movd:movq)

The fact that ML64 does not accept MOVQ with a general-purpose register operand is in disagreement with Intel's manuals, and therefore - in my view at least - a bug (or at least an inconsistency).

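ML64 does seem to use MOVD in its place, even for 64-bit registers, so writing movd xmm0, rcx and movd rcx, xmm0 should assemble where the MOV lines in the question did not. You can verify this by disassembling the code it generates.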


Note that there are two different movq instructions (not counting load and store forms as separate):

  • One is the movq xmm, xmm/m64 form, the MMX/SSE2 instruction that copies between vector registers or loads/stores. This existed in 32-bit mode with MMX (and SSE2), and the opcode always implies a 64-bit transfer (zero-extending to 128 bits with an XMM destination). ML64 uses movq for this form.

  • The other is the 64-bit version of movd xmm, r/m32 that can move data between XMM or MMX registers and GP-integer registers like RCX, or memory. This form is new with x86-64 (which includes MMX and SSE2); the opcode is the same as movd, with a REX.W prefix for 64-bit operand-size. ML64 apparently always uses movd for this form, regardless of the actual operand-size.
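The opcode relationship in the second bullet can be checked by hand. The Python sketch below is not an assembler: it only builds the register-to-register encodings for xmm0-xmm7 and the low eight general-purpose registers, and the helper name encode_xmm_gpr is made up for this illustration. Under those assumptions it reproduces the bytes from the objdump listing in the question.

```python
# Low eight GP registers in encoding order; "r" + name is 64-bit, "e" + name is 32-bit.
LOW_GPRS = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def encode_xmm_gpr(xmm_num, gpr_name, to_xmm):
    """Encode movd/movq between xmmN and a GP register (register operands only).

    Hypothetical helper for illustration. Covers only xmm0-xmm7 and the low
    eight GP registers, so no REX.R / REX.B extension bits are needed.
    """
    if not 0 <= xmm_num <= 7:
        raise ValueError("only xmm0-xmm7 are handled in this sketch")
    if gpr_name[:1] not in ("r", "e") or gpr_name[1:] not in LOW_GPRS:
        raise ValueError("unsupported general-purpose register: " + gpr_name)

    wide = gpr_name[0] == "r"            # 64-bit operand size needs REX.W
    gpr_num = LOW_GPRS.index(gpr_name[1:])

    encoding = [0x66]                    # mandatory prefix selecting the XMM form
    if wide:
        encoding.append(0x48)            # REX.W
    encoding += [0x0F, 0x6E if to_xmm else 0x7E]
    # ModRM: mod=11 (register direct), reg=XMM register, rm=GP register
    encoding.append(0xC0 | (xmm_num << 3) | gpr_num)
    return bytes(encoding)


if __name__ == "__main__":
    print(encode_xmm_gpr(0, "rcx", to_xmm=True).hex(" "))   # 66 48 0f 6e c1
    print(encode_xmm_gpr(0, "rax", to_xmm=False).hex(" "))  # 66 48 0f 7e c0
    print(encode_xmm_gpr(0, "ecx", to_xmm=True).hex(" "))   # 66 0f 6e c1
```

The 32-bit and 64-bit results differ only by the 48 (REX.W) byte, which is consistent with one mnemonic being used for both operand sizes.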

A 64-bit load or store between an XMM register and memory can use either opcode, but the first form is shorter, not needing a REX prefix.
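To make the size difference concrete, here is a small Python sketch that builds both encodings of a 64-bit load into an XMM register from a plain [reg] address. The function names load_first_form and load_second_form are invented for this example, and only base registers that need no SIB byte or displacement are supported.

```python
# Base registers usable with a plain [reg] ModRM byte (mod=00).
# rsp would need a SIB byte and rbp's slot means RIP-relative addressing,
# so both are left out of this sketch.
SIMPLE_BASES = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "rsi": 6, "rdi": 7}


def modrm_mem(xmm_num, base):
    """ModRM byte for xmmN, [base]: mod=00, reg=XMM register, rm=base register."""
    if not 0 <= xmm_num <= 7:
        raise ValueError("only xmm0-xmm7 are handled in this sketch")
    return (xmm_num << 3) | SIMPLE_BASES[base]


def load_first_form(xmm_num, base):
    """64-bit load via the movq xmm, xmm/m64 opcode: F3 0F 7E /r."""
    return bytes([0xF3, 0x0F, 0x7E, modrm_mem(xmm_num, base)])


def load_second_form(xmm_num, base):
    """The same load via the REX.W version of movd: 66 REX.W 0F 6E /r."""
    return bytes([0x66, 0x48, 0x0F, 0x6E, modrm_mem(xmm_num, base)])


if __name__ == "__main__":
    print(load_first_form(0, "rax").hex(" "))   # f3 0f 7e 00
    print(load_second_form(0, "rax").hex(" "))  # 66 48 0f 6e 00
```

In this case the first form is one byte shorter, because it does not need the REX prefix.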

(AT&T syntax movq %rax, %rcx is just mov with a q operand-size suffix; in that case the q is not part of the true mnemonic.)