How to get the sign, mantissa and exponent of a floating point number

MetallicPriest picture MetallicPriest · Mar 28, 2013 · Viewed 104.2k times · Source

I have a program, which is running on two processors, one of which does not have floating point support. So, I need to perform floating point calculations using fixed point in that processor. For that purpose, I will be using a floating point emulation library.

I need to first extract the signs, mantissas and exponents of floating point numbers on the processor which do support floating point. So, my question is how can I get the sign, mantissa and exponent of a single precision floating point number.

Following the format from this figure,

enter image description here That is what I've done so far, but except sign, neither mantissa and exponent are correct. I think, I'm missing something.

void getSME( int& s, int& m, int& e, float number )
{
    unsigned int* ptr = (unsigned int*)&number;

    s = *ptr >> 31;
    e = *ptr & 0x7f800000;
    e >>= 23;
    m = *ptr & 0x007fffff;
}

Answer

Pietro Braione picture Pietro Braione · Oct 26, 2013

My advice is to stick to rule 0 and not redo what standard libraries already do, if this is enough. Look at math.h (cmath in standard C++) and functions frexp, frexpf, frexpl, that break a floating point value (double, float, or long double) in its significand and exponent part. To extract the sign from the significand you can use signbit, also in math.h / cmath, or copysign (only C++11). Some alternatives, with slighter different semantics, are modf and ilogb/scalbn, available in C++11; http://en.cppreference.com/w/cpp/numeric/math/logb compares them, but I didn't find in the documentation how all these functions behave with +/-inf and NaNs. Finally, if you really want to use bitmasks (e.g., you desperately need to know the exact bits, and your program may have different NaNs with different representations, and you don't trust the above functions), at least make everything platform-independent by using the macros in float.h/cfloat.