How to normalize a mantissa

Tommy K picture Tommy K · Mar 2, 2015 · Viewed 39.2k times · Source

I'm trying to convert an int into a custom float, in which the user specifies the amount of bits reserved for the exp and mantissa, but I don't understand how the conversion works. My function takes in an int value and and int exp to represent the number (value * 2^exp) i.e value = 12, exp = 4, returns 192. but I don't understand the process I need to do to change these. I've been looking at this for days and playing with IEEE converter web apps but I just don't understand what the normalization process is. Like I see that its "move the binary point and adjust the exponent" but I have no idea what this means, can anyone give me an example to go off of? Also I don't understand what the exponent bias is. The only info I have is that you just add a number to your exponent but I don't understand why. I've been searching Google for an example I can understand but this just isn't making any sense to me

Answer

eigenchris picture eigenchris · Mar 2, 2015

A floating point number is normalized when we force the integer part of its mantissa to be exactly 1 and allow its fraction part to be whatever we like.

For example, if we were to take the number 13.25, which is 1101.01 in binary, 1101 would be the integer part and 01 would be the fraction part.

I could represent 13.25 as 1101.01*(2^0), but this isn't normalized because the integer part is not 1. However, we are allowed to shift the mantissa to the right one digit if we increase the exponent by 1:

  1101.01*(2^0)
= 110.101*(2^1)
= 11.0101*(2^2)
= 1.10101*(2^3)

This representation 1.10101*(2^3) is the normalized form of 13.25.


That said, we know that normalized floating point numbers will always come in the form 1.fffffff * (2^exp)

For efficiency's sake, we don't bother storing the 1 integer part in the binary representation itself, we just pretend it's there. So if we were to give your custom-made float type 5 bits for the mantissa, we would know the bits 10100 would actually stand for 1.10100.

Here is an example with the standard 23-bit mantissa:

enter image description here


As for the exponent bias, let's take a look at the standard 32-bit float format, which is broken into 3 parts: 1 sign bit, 8 exponent bits, and 23 mantissa bits:

s eeeeeeee mmmmmmmmmmmmmmmmmmmmmmm

The exponents 00000000 and 11111111 have special purposes (like representing Inf and NaN), so with 8 exponent bits, we could represent 254 different exponents, say 2^1 to 2^254, for example. But what if we want to represent 2^-3? How do we get negative exponents?

The format fixes this problem by automatically subtracting 127 from the exponent. Therefore:

  • 0000 0001 would be 1 -127 = -126
  • 0010 1101 would be 45 -127 = -82
  • 0111 1111 would be 127-127 = 0
  • 1001 0010 would be 136-127 = 9

This changes the exponent range from 2^1 ... 2^254 to 2^-126 ... 2^+127 so we can represent negative exponents.