ARM Assembly: Absolute Value Function: Are two or three lines faster?

Ken W · May 11, 2013 · Viewed 8k times

In my embedded systems class, we were asked to rewrite the given C function absval in ARM assembly. We were told that the best we could do was three instructions. I was determined to find a two-instruction solution and eventually did, but my question now is whether I actually improved performance or made it worse.

The C-code:

unsigned long absval(signed long x){
    unsigned long int signext;
    signext = (x >= 0) ? 0 : -1; //This can be done with an ASR instruction
    return (x + signext) ^ signext;
}

The TA/Professor's 3-line solution:

ASR R1, R0, #31         ; R1 <- (x >= 0) ? 0 : -1
ADD R0, R0, R1          ; R0 <- R0 + R1
EOR R0, R0, R1          ; R0 <- R0 ^ R1

My 2-line solution:

ADD R1, R0, R0, ASR #31 ; R1 <- x  + ((x >= 0) ? 0 : -1)
EOR R0, R1, R0, ASR #31 ; R0 <- R1 ^ ((x >= 0) ? 0 : -1)
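
As a sanity check that the two sequences compute the same value, here is a rough C model of both. It is only a sketch: the names asr31, absval_three and absval_two are made up for illustration, registers are assumed to be 32 bits wide, and the shift is modeled with a comparison because right-shifting a negative signed value is implementation-defined in C. It says nothing about timing, only about results.

```c
#include <stdint.h>

/* Hypothetical helper modeling "ASR #31" on a 32-bit register:
   0 when x >= 0, 0xFFFFFFFF (i.e. -1) when x < 0. */
static uint32_t asr31(int32_t x) {
    return (x < 0) ? UINT32_MAX : 0u;
}

/* Model of the three-instruction sequence. */
uint32_t absval_three(int32_t x) {
    uint32_t r0 = (uint32_t)x;
    uint32_t r1 = asr31(x);        /* ASR R1, R0, #31 */
    r0 = r0 + r1;                  /* ADD R0, R0, R1  */
    r0 = r0 ^ r1;                  /* EOR R0, R0, R1  */
    return r0;
}

/* Model of the two-instruction sequence: the shift is folded into
   the second operand of each instruction, so it is evaluated twice.
   R0 still holds the original x when the EOR executes. */
uint32_t absval_two(int32_t x) {
    uint32_t r0 = (uint32_t)x;
    uint32_t r1 = r0 + asr31(x);   /* ADD R1, R0, R0, ASR #31 */
    r0 = r1 ^ asr31(x);            /* EOR R0, R1, R0, ASR #31 */
    return r0;
}
```

Both functions use unsigned arithmetic, so the most negative input wraps to 0x80000000 instead of overflowing.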

There are a couple of places I can see potential performance differences:

  1. The addition of one extra Arithmetic Shift Right operation
  2. The removal of one instruction fetch from memory

So, which one is actually faster? Does it depend upon the processor or memory access speed?

Answer

Nils Pipenbrinck · May 13, 2013

Here is another two-instruction version:

    cmp     r0, #0
    rsblt   r0, r0, #0

Which translates to this simple code:

  if (r0 < 0)
  {
    r0 = 0-r0;
  }

That code should be pretty fast, even on modern ARM-CPU cores like the Cortex-A8 and A9.
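
For comparison with the C function in the question, the compare-and-negate pair can be written roughly like this in C. This is a sketch under assumptions: the name absval_rsb is hypothetical, the register is taken to be 32 bits wide, and the negation is done on an unsigned value so that the most negative input wraps instead of triggering signed overflow.

```c
#include <stdint.h>

/* Hypothetical C model of the cmp / rsblt pair. */
uint32_t absval_rsb(int32_t x) {
    uint32_t r0 = (uint32_t)x;
    if (x < 0) {          /* cmp r0, #0 sets the flags; LT means x < 0 */
        r0 = 0u - r0;     /* rsblt r0, r0, #0 : r0 = 0 - r0 */
    }
    return r0;
}
```

Whether a compiler actually emits the conditional rsb for this source depends on the compiler and target options.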