What's the largest denormalized and normalized number? (64-bit, IEEE 754-1985)

Carol.Kar · Nov 19, 2013 · Viewed 11.8k times

I am struggling with floating-point arithmetic, and I really want to understand this topic!

I know that the numbers can be represented in scientific notation.

So for both numbers the exponent should look like:

Denormalized Number: 11....11 so (1+1/2 + 1/2^2 + ... + 1/2^52)*2^1023

Normalized Number: 11....11 so (1+1/2 + 1/2^2 + ... + 1/2^52)*2^1024

However, I am not sure whether this is correct.

I really would appreciate your answer!

PS: Wikipedia gives the number, but I do not know how they came up with it.

Answer

Jeffrey Sax · Dec 14, 2013

As you know, the double-precision format looks like this:

[Figure: IEEE 754 double-precision layout: 1 sign bit, 11 exponent bits, 52 fraction bits]

The key to understanding denormalized numbers is that they are not really floating-point numbers in the usual sense: they use a fixed-point micro-format built from the bit patterns that the 'normal' format leaves unused.

Normal floating-point numbers are of the form m*2^e, where e is found by subtracting the bias (1023) from the exponent field above, and m is a number with 1 <= m < 2 whose bits after the 'binary' point are given by the fraction above. The 1 in front of the binary point is not stored, because it is known to be always 1. The exponent field has a value from 1 to 2046, so e ranges from -1022 to 1023. The values 0 (all zeroes) and 2047 (all ones) are reserved for special uses.
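To make the decoding rule concrete, here is a minimal Python sketch. The helper names `split_fields` and `decode_normal` are made up for illustration; only the standard `struct` and `math` modules are used. It pulls the three fields out of a double and rebuilds the value as m*2^e.

```python
import math
import struct

BIAS = 1023  # exponent bias of the 64-bit format


def split_fields(x):
    """Return (sign, exponent_field, fraction) of a double as integers."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent_field = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent_field, fraction


def decode_normal(x):
    """Rebuild a normal double from its fields as m * 2**e."""
    sign, exponent_field, fraction = split_fields(x)
    if not 1 <= exponent_field <= 2046:
        raise ValueError("not a normal number")
    m = 1 + fraction / 2**52      # implicit leading 1, so 1 <= m < 2
    e = exponent_field - BIAS     # -1022 <= e <= 1023
    value = math.ldexp(m, e)      # m * 2**e
    return -value if sign else value


print(split_fields(1.0))          # (0, 1023, 0)
print(decode_normal(6.5) == 6.5)  # True
```

Since 1.0 = 1.0*2^0, its exponent field is exactly the bias and its fraction is zero.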

All ones in the exponent field means we have either an infinity or a NaN (Not-a-Number).
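As a quick check, the sketch below builds these bit patterns by hand. `from_fields` is a hypothetical helper written for this example. It relies on the standard convention that an all-ones exponent with a zero fraction is an infinity, and with a nonzero fraction is a NaN.

```python
import math
import struct


def from_fields(sign, exponent_field, fraction):
    """Assemble a double from its three raw fields."""
    bits = (sign << 63) | (exponent_field << 52) | fraction
    (value,) = struct.unpack(">d", struct.pack(">Q", bits))
    return value


pos_inf = from_fields(0, 2047, 0)     # all-ones exponent, zero fraction
neg_inf = from_fields(1, 2047, 0)
nan = from_fields(0, 2047, 1 << 51)   # all-ones exponent, nonzero fraction

print(math.isinf(pos_inf), math.isinf(neg_inf), math.isnan(nan))  # True True True
```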

All zeroes means we're dealing with denormal floating-point numbers (or with zero, when the fraction is also all zeroes). These are still of the same form, m*2^e, but the values of m and e are derived differently. m is now a number between 0 and 1, so there is a 0 in front of the binary point instead of the 1 used for normal numbers. e always has the same value: -1022. So the exponent is a constant, which is why I called it a fixed-point format earlier.
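The fixed-point nature can be seen in a small Python sketch. `decode_denormal` is an illustrative name, not a library function. Every denormal is simply the integer fraction field times the constant 2^-1074, which is the same as (fraction/2^52)*2^-1022.

```python
import math
import struct


def decode_denormal(x):
    """Rebuild a denormal double as m * 2**-1022 with 0 <= m < 1."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent_field = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if exponent_field != 0:
        raise ValueError("not a denormal number")
    m = fraction / 2**52             # leading 0 instead of the implicit 1
    value = math.ldexp(m, -1022)     # the exponent is always -1022
    # Fixed-point view: the same value is fraction * 2**-1074.
    assert value == math.ldexp(fraction, -1074)
    return -value if sign else value


tiny = math.ldexp(1.0, -1030)        # 2**-1030, below the normal range
print(decode_denormal(tiny) == tiny)     # True
print(decode_denormal(5e-324) == 5e-324) # True (5e-324 is the smallest denormal)
```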

So, the largest possible values for each are:

  • Normal: (1+1/2 + 1/2^2 + ... + 1/2^52)*2^1023 = (2-2^-52)*2^1023 = 1.797...e+308
  • Denormal: (0+1/2 + 1/2^2 + ... + 1/2^52)*2^-1022 = (1-2^-52)*2^-1022 = 2.225...e-308
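These two values can be checked exactly in Python. The sketch below uses a hypothetical `from_fields` helper to build the bit patterns with all fraction bits set, and compares them with the closed forms using exact rational arithmetic from the standard `fractions` module.

```python
import struct
import sys
from fractions import Fraction


def from_fields(sign, exponent_field, fraction):
    """Assemble a double from its three raw fields."""
    bits = (sign << 63) | (exponent_field << 52) | fraction
    (value,) = struct.unpack(">d", struct.pack(">Q", bits))
    return value


ALL_ONES = (1 << 52) - 1  # every fraction bit set

largest_normal = from_fields(0, 2046, ALL_ONES)   # largest usable exponent field
largest_denormal = from_fields(0, 0, ALL_ONES)    # exponent field all zeroes

# Exact comparison against the closed forms (no rounding involved).
assert Fraction(largest_normal) == (2 - Fraction(1, 2**52)) * 2**1023
assert Fraction(largest_denormal) == (1 - Fraction(1, 2**52)) * Fraction(1, 2**1022)

# The largest normal is what Python reports as the maximum float, and the
# largest denormal sits just below the smallest normal number.
assert largest_normal == sys.float_info.max
assert largest_denormal < sys.float_info.min

print(largest_normal)    # 1.7976931348623157e+308
print(largest_denormal)  # roughly 2.225e-308
```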