Floating point division vs floating point multiplication

Question 1

c++ floating-point micro-optimization

sum1stolemyname · Nov 8, 2010 · Viewed 63.6k times · Source

Answer

Yes. Many CPUs can perform a multiplication in 1 or 2 clock cycles, but division always takes longer (although FP division is sometimes faster than integer division).

If you look at this answer, you will see that division can exceed 24 cycles.

Why does division take so much longer than multiplication? If you remember back to grade school, you may recall that multiplication can essentially be performed with many simultaneous additions. Division requires iterative subtraction, and each step depends on the result of the previous one, so the steps cannot be performed simultaneously and the operation takes longer. In fact, some FP units speed up division by computing an approximation of the reciprocal and multiplying by that. It isn't quite as accurate, but it is somewhat faster.
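To make the reciprocal trick concrete, here is a minimal software sketch of the idea, assuming a Newton-Raphson refinement. The names `approx_reciprocal` and `approx_divide`, the linear starting estimate, and the default of three iterations are illustrative assumptions, not a description of what any particular FP unit does. The point it shows is that once a rough estimate of 1/d exists, it can be refined using only multiplications and subtractions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (an assumption for illustration, not any CPU's actual
// algorithm): approximate 1/d for finite, nonzero d with Newton-Raphson.
float approx_reciprocal(float d, int iterations = 3) {
    assert(std::isfinite(d) && d != 0.0f);

    // Split |d| into a mantissa m in [0.5, 1) and a power of two:
    // |d| = m * 2^exponent.
    int exponent = 0;
    const float m = std::frexp(std::fabs(d), &exponent);

    // Linear starting estimate of 1/m on [0.5, 1).
    float x = 48.0f / 17.0f - (32.0f / 17.0f) * m;

    // Each step uses only multiply and subtract, and roughly doubles the
    // number of correct bits in x.
    for (int i = 0; i < iterations; ++i) {
        x = x * (2.0f - m * x);
    }

    // Undo the scaling and restore the sign: 1/d = (1/m) * 2^(-exponent).
    return std::copysign(std::ldexp(x, -exponent), d);
}

// Division expressed as "multiply by an approximate reciprocal".
float approx_divide(float n, float d) {
    return n * approx_reciprocal(d);
}
```

Calling `approx_divide(200.0f, 3.0f)` should land very close to `200.0f / 3.0f`, but the last bit may differ from a correctly rounded division, which is the accuracy trade-off mentioned above.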

Question 2

Is there any (non-microoptimization) performance gain by coding

float f1 = 200.0f / 2.0f;

in comparison to

float f2 = 200.0f * 0.5f;

A professor of mine told me a few years ago that floating point divisions were slower than floating point multiplications, without elaborating on why.

Does this statement hold for modern PC architecture?

Update 1

In response to a comment, please also consider this case:

float f1;
float f2 = 2.0f;
float f3 = 3.0f;
for (int i = 0; i < 100000000; i++)
{
  f1 = (i * f2 + i / f3) * 0.5f; // or divide by 2.0f, respectively
}
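For a loop like this one, the division that matters is `i / f3`, because `f3` is not known to be a power of two. A minimal sketch of the usual manual rewrite follows. The function names and the running sum `acc` are assumptions added for illustration; the sum is there only so the loop body has an observable result. Dividing by `2.0f` is a different case: `0.5f` is exactly representable, so multiplying by it gives the same result, and compilers typically make that substitution themselves. For a general divisor, the reciprocal is rounded, so the result can differ in the last bits, and compilers usually refuse the rewrite unless told otherwise (for example with `-freciprocal-math` or `-ffast-math` in GCC and Clang).

```cpp
#include <cassert>

// Hypothetical variant of the loop above: one division per iteration.
float sum_with_division(int n, float f2, float f3) {
    assert(f3 != 0.0f);
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) {
        acc += (i * f2 + i / f3) * 0.5f;
    }
    return acc;
}

// Same loop with the division hoisted out: one division in total, then a
// multiplication per iteration. Results may differ slightly from the
// version above because 1.0f / f3 is itself rounded.
float sum_with_reciprocal(int n, float f2, float f3) {
    assert(f3 != 0.0f);
    const float inv_f3 = 1.0f / f3;
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) {
        acc += (i * f2 + i * inv_f3) * 0.5f;
    }
    return acc;
}
```

Timing both functions with the same arguments, for example `(100000000, 2.0f, 3.0f)`, is one way to check whether the difference is measurable on a given machine and compiler.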

Update 2

Quoting from the comments:

[I want] to know what are the algorithmic / architectural requirements that cause division to be vastly more complicated in hardware than multiplication
