I’m using a log-based class in C++ to store very small floating-point values (as the values otherwise go beyond the range of double). Since I’m performing a large number of multiplications, this has the added benefit of converting the multiplications into sums.
However, at a certain point in my algorithm, I need to divide a standard double value by an integer value and then *= the result into a log-based value. I have overloaded the *= operator for my log-based class; the right-hand-side value is first converted to a log-based value by calling log() and then added to the left-hand-side value.
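For concreteness, the wrapper looks roughly like this (the class name LogDouble and its layout are illustrative, not my actual code):

```cpp
#include <cmath>

// Minimal sketch of the log-domain wrapper (illustrative names only).
struct LogDouble {
    double logval;                       // stores log(x) instead of x

    explicit LogDouble(double x) : logval(std::log(x)) {}

    // Multiplication in the linear domain becomes addition in the log domain.
    LogDouble& operator*=(double rhs) {
        logval += std::log(rhs);         // convert rhs to log space, then add
        return *this;
    }
};
```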
The operations actually performed are therefore a floating-point division, a log(), and a floating-point addition. My question is whether it would be faster to first convert the denominator to a log-based value, which would replace the floating-point division with a floating-point subtraction, yielding the following chain of operations: two calls to log(), a floating-point subtraction, and a floating-point addition.
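Spelled out with plain doubles (acc_log standing for the stored logarithm, x and denom being illustrative inputs), the two alternatives are roughly:

```cpp
#include <cmath>

// Option A: divide first, then convert the quotient to log space.
double option_a(double acc_log, double x, int denom) {
    return acc_log + std::log(x / denom);                      // divide, log(), add
}

// Option B: convert both operands and subtract in log space.
double option_b(double acc_log, double x, int denom) {
    return acc_log + (std::log(x) - std::log(double(denom)));  // 2x log(), subtract, add
}
```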
In the end, this boils down to whether floating-point division is faster or slower than log(). I suspect the common answer will be that this is compiler- and architecture-dependent, so I’ll say that I use gcc 4.2 from Apple on Darwin 10.3.0. Still, I hope to get an answer with a general remark on the relative speed of these two operations, and/or an idea of how to measure the difference myself, as there might be more going on here, e.g. executing the constructors that do the type conversion, etc.
Cheers!
Do you divide by the same integer multiple times? If so, you can instead multiply by 1./yourInteger and do the divide only once. That would be faster than either option, if it is possible.
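A rough sketch of what I mean, with illustrative names (the loop and divisor are placeholders for whatever your algorithm does):

```cpp
#include <cstddef>
#include <vector>

// Hoist the division out of the loop: one divide, many multiplies.
void scale_all(std::vector<double>& values, int divisor) {
    const double inv = 1.0 / divisor;        // the only division
    for (std::size_t i = 0; i < values.size(); ++i)
        values[i] *= inv;                    // multiplication instead of division
}
```

(Multiplying by the rounded reciprocal can differ from an exact division in the last bit, which is usually irrelevant here but worth being aware of.)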
As to your actual question, it's not only compiler- and architecture-dependent, but also micro-architecture- and data-dependent.
On your particular platform (darwin/x86), for current i5/i7 hardware: ~24 cycles for divide (1), ~35 cycles for log() (2). However, because divide uses only a single instruction dispatch slot, the hardware's reorder engine can do other useful computation while the divide is in flight; log(), by contrast, is implemented in software, so there is less opportunity for the processor to hoist other computations into the latency of the logarithm. This means that, in practice, divide will often be a good bit faster.
1) From the Intel Optimization Manual.
2) Measured by calling log() in a tight loop and using mach_absolute_time() to get wall time.
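A rough sketch of that measurement loop, if you want to reproduce it yourself; the iteration count and the volatile sink are my choices here, not a canonical harness:

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <mach/mach_time.h>

int main() {
    const int N = 10000000;
    double acc = 0.0;

    mach_timebase_info_data_t tb;
    mach_timebase_info(&tb);                  // ratio for converting ticks to nanoseconds

    uint64_t start = mach_absolute_time();
    for (int i = 0; i < N; ++i)
        acc += std::log(1.0 + i);             // the operation under test
    uint64_t end = mach_absolute_time();

    volatile double sink = acc;               // keep the loop from being optimized away
    (void)sink;

    double ns = (double)(end - start) * tb.numer / tb.denom;
    std::printf("%.2f ns per log()\n", ns / N);
    return 0;
}
```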