difference between conditional instructions (cmov) and jump instructions

jigglypuff · Oct 2, 2014 · Viewed 9.2k times

I'm confused about when to use cmov instructions and when to use jump instructions in assembly.

From a performance point of view:

  • What is the difference between them?
  • Which one is better?

If possible, please explain their difference with an example.

Answer

Ira Baxter · Oct 2, 2014

movcc is a so-called predicated instruction. That's fancy-speak for "this instruction executes under a condition (predicate)".

Many processors, including the x86, set the condition code bits after an arithmetic operation (especially a compare instruction) to indicate the status of the result of the operation.

A conditional jump instruction checks the condition code bits for a status, and if true, jumps to a designated target.

Because the jump is conditional, and the processor typically has a deep pipeline, the condition code bits may literally not be ready when the CPU encounters the conditional jump. The chip designers could simply wait for the pipeline to drain (often many clock cycles) and then execute the jump, but that would make the processor slow.

Instead, most of them choose to have a branch prediction algorithm, which predicts which way a conditional jump will go. The processor can then fetch, decode, and execute the predicted path (taken or not taken) and continue fast execution, with the proviso that if the condition code bits that finally arrive show the prediction was wrong (a branch mispredict), the processor undoes all the work it did after the branch and re-executes the program going down the other path.

Conditional jumps are harder for pipelined execution than normal data dependencies, because they can change which instruction should be next in the stream of instructions flowing through the pipeline. This is called a control dependency, as opposed to a data dependency (like an add where both inputs are outputs of other recent instructions).

The branch predictors turn out to be very good, because most branches are biased toward one direction. (The branch at the end of most loops is typically going to branch back to the top.) So most of the time the processor doesn't have to back out of wrongly predicted work.

If the direction of the branch is highly unpredictable, then the processor will guess wrong as often as about 50% of the time, and each time it has to back out work. That's expensive.

OK, now, one often finds code like this:

  cmp   ...
  jcc   skip                  ; condition true: jump over the mov
  mov   register1, register2
skip:                         ; continue here
  ...
  ; use register1

If the branch predictor guesses right, this code is fast, no matter which way the branch goes. If it guesses wrong a lot... ouch.
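In C terms, the pattern above corresponds roughly to an `if` that guards a single assignment. The sketch below is only an illustration: the function name and parameters are made up, `r1` and `r2` stand in for register1 and register2, and a compiler is free to emit either a conditional jump or a conditional move for it depending on target and optimization settings.

```c
/* Hypothetical helper: r1/r2 play the role of register1/register2.
   Compiled naively, the if becomes cmp + jcc around a mov. */
int select_with_branch(int r1, int r2, int a, int b)
{
    if (a < b) {   /* cmp a, b ; jump over the assignment when a >= b */
        r1 = r2;   /* mov register1, register2 */
    }
    return r1;     /* "use register1" */
}
```

Calling `select_with_branch(1, 2, 3, 4)` yields 2 because the comparison holds, while `select_with_branch(1, 2, 4, 3)` leaves the first value untouched and yields 1.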

Thus the conditional move instruction. This is a move that conditionally moves data, based on the condition code bits. We can rewrite the above:

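  cmp   ...
  ; cc here is the inverse of the jump's condition: the move
  ; happens exactly when the jump would have fallen through
  movcc  register1, register2
  ; continue here
  ...
  ; use register1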

Now we have no branch instructions, and thus no mispredicts that make the processor undo all the work. Since there is no control dependency, the following instructions need to be fetched and decoded regardless of whether the movcc acts like a mov or nop. The pipeline can stay full without predicting the condition and speculatively executing instructions that use register1. (You could build a CPU that way, but it would defeat the purpose of movcc.)
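A loop that keeps a running maximum is a typical place where this shows up. In the C sketch below (the name `max_of` and its parameters are hypothetical), both candidate values are already available, so the update can be a compare followed by a conditional move. Whether a given compiler actually emits a conditional move here is an assumption, not a guarantee; it may still choose a branch.

```c
#include <stddef.h>

/* Hypothetical example: running maximum, starting from `start`.
   The ternary is a candidate for cmp + cmovcc because it only
   chooses between two values that are already computed. */
int max_of(const int *data, size_t n, int start)
{
    int best = start;
    for (size_t i = 0; i < n; i++) {
        best = (data[i] > best) ? data[i] : best;
    }
    return best;
}
```

To see what your compiler really did, inspect its assembly output (for example with `-S` on GCC or Clang) rather than relying on the shape of the source.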

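movcc converts a control dependency into a data dependency. The CPU treats it exactly like a 3-input math instruction, with the inputs being EFLAGS and its two "regular" inputs (the destination register and the source register or memory operand). On x86, adc is identical to cmovae (mov if CF==0) as far as how out-of-order execution tracks the dependencies: the inputs are CF and both GP registers, and the output is the destination register. The trade-off is that anything using register1 now has to wait until the flags and both operands are ready, whereas a correctly predicted branch lets that work start early. So movcc tends to win when the branch is unpredictable, and a jump tends to win when it predicts well.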
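One way to picture a select as a pure data dependency is to compute it with arithmetic alone. The C sketch below is a software analogy, not a description of how the hardware implements cmovcc, and `select_u32` is a made-up name: the result is a function of three inputs (the condition and the two values), with no branch anywhere.

```c
#include <stdint.h>

/* Hypothetical helper: returns if_true when cond is nonzero,
   otherwise if_false, using only data operations. */
uint32_t select_u32(int cond, uint32_t if_true, uint32_t if_false)
{
    /* mask is all ones when cond != 0, all zeros otherwise
       (unsigned subtraction wraps, so 0 - 1 == 0xFFFFFFFF) */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (if_true & mask) | (if_false & ~mask);
}
```

As with a conditional move, the result cannot be produced until all three inputs are known, which is exactly the data dependency described above.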

For the x86, there are cmovcc, jcc, and setcc instructions for every condition combination cc. (setcc sets the destination to 0 or 1, according to the condition. So it has a data dependency on the flags, and no other input dependencies.)
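The setcc idea has a direct counterpart in C, where a comparison evaluates to 0 or 1. The sketch below (with the hypothetical name `count_at_least`) adds that 0 or 1 to a counter instead of branching on it. A compiler might implement the comparison with setcc or with some other branch-free sequence; that part is an assumption about code generation, not something the source guarantees.

```c
#include <stddef.h>

/* Hypothetical example: count the elements that are >= threshold
   without an if. The comparison itself yields 0 or 1. */
size_t count_at_least(const int *data, size_t n, int threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count += (size_t)(data[i] >= threshold);
    }
    return count;
}
```

For the array `{1, 200, 50, 128, 127}` and a threshold of 128, the function returns 2.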