What is different about C++ math.h abs() compared to my abs()?

moka · May 20, 2010 · Viewed 18.2k times

I am currently writing some GLSL-like vector math classes in C++, and I just implemented an abs() function like this:

template<class T>
static inline T abs(T _a)
{
    return _a < 0 ? -_a : _a;
}

I compared its speed to the default C++ abs from math.h like this:

clock_t begin = clock();
for(int i=0; i<10000000; ++i)
{
    float a = abs(-1.25);
};

clock_t end = clock();
unsigned long time1 = (unsigned long)((float)(end-begin) / ((float)CLOCKS_PER_SEC/1000.0));

begin = clock();
for(int i=0; i<10000000; ++i)
{
    float a  = myMath::abs(-1.25);
};
end = clock();
unsigned long time2 = (unsigned long)((float)(end-begin) / ((float)CLOCKS_PER_SEC/1000.0));

std::cout<<time1<<std::endl;
std::cout<<time2<<std::endl;

Now the default abs takes about 25 ms while mine takes 60 ms. I guess there is some low-level optimisation going on. Does anybody know how math.h abs works internally? The performance difference is nothing dramatic, but I am just curious!

Answer

GManNickG · May 20, 2010

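Because the standard library authors write the implementation itself, they are free to make as many platform-specific assumptions as they want. They know the exact format of a double on the target platform and can manipulate its bits directly instead of comparing and negating.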

Almost certainly, your double uses the IEEE 754 binary64 format. This means the sign has its own bit, and taking the absolute value is merely clearing that bit. For example, as a specialization, a compiler implementer may do the following:

template <>
double abs<double>(const double x)
{
    // breaks strict aliasing, but compiler writer knows this behavior for the platform
    std::uint64_t i = reinterpret_cast<const std::uint64_t&>(x);
    i &= 0x7FFFFFFFFFFFFFFFULL; // clear sign bit

    return reinterpret_cast<const double&>(i);
}

This removes branching and may run faster.