How to efficiently compare the sign of two floating-point values while handling negative zeros

François Beaune picture François Beaune · May 27, 2010 · Viewed 12.2k times · Source

Given two floating-point numbers, I'm looking for an efficient way to check if they have the same sign, given that if any of the two values is zero (+0.0 or -0.0), they should be considered to have the same sign.

For instance,

  • SameSign(1.0, 2.0) should return true
  • SameSign(-1.0, -2.0) should return true
  • SameSign(-1.0, 2.0) should return false
  • SameSign(0.0, 1.0) should return true
  • SameSign(0.0, -1.0) should return true
  • SameSign(-0.0, 1.0) should return true
  • SameSign(-0.0, -1.0) should return true

A naive but correct implementation of SameSign in C++ would be:

bool SameSign(float a, float b)
{
    if (fabs(a) == 0.0f || fabs(b) == 0.0f)
        return true;

    return (a >= 0.0f) == (b >= 0.0f);
}

Assuming the IEEE floating-point model, here's a variant of SameSign that compiles to branchless code (at least with with Visual C++ 2008):

bool SameSign(float a, float b)
{
    int ia = binary_cast<int>(a);
    int ib = binary_cast<int>(b);

    int az = (ia & 0x7FFFFFFF) == 0;
    int bz = (ib & 0x7FFFFFFF) == 0;
    int ab = (ia ^ ib) >= 0;

    return (az | bz | ab) != 0;
}

with binary_cast defined as follow:

template <typename Target, typename Source>
inline Target binary_cast(Source s)
{
    union
    {
        Source  m_source;
        Target  m_target;
    } u;
    u.m_source = s;
    return u.m_target;
}

I'm looking for two things:

  1. A faster, more efficient implementation of SameSign, using bit tricks, FPU tricks or even SSE intrinsics.

  2. An efficient extension of SameSign to three values.

Edit:

I've made some performance measurements on the three variants of SameSign (the two variants described in the original question, plus Stephen's one). Each function was run 200-400 times, on all consecutive pairs of values in an array of 101 floats filled at random with -1.0, -0.0, +0.0 and +1.0. Each measurement was repeated 2000 times and the minimum time was kept (to weed out all cache effects and system-induced slowdowns). The code was compiled with Visual C++ 2008 SP1 with maximum optimization and SSE2 code generation enabled. The measurements were done on a Core 2 Duo P8600 2.4 Ghz.

Here are the timings, not counting the overhead of fetching input values from the array, calling the function and retrieving the result (which amount to 6-7 clockticks):

  • Naive variant: 15 ticks
  • Bit magic variant: 13 ticks
  • Stephens's variant: 6 ticks

Answer

Stephen Canon picture Stephen Canon · May 27, 2010

If you don't need to support infinities, you can just use:

inline bool SameSign(float a, float b) {
    return a*b >= 0.0f;
}

which is actually pretty fast on most modern hardware, and is completely portable. It doesn't work properly in the (zero, infinity) case however, because zero * infinity is NaN, and the comparison will return false, regardless of the signs. It will also incur a denormal stall on some hardware when a and b are both tiny.