I want to convert the output of gethrtime (which is in nanoseconds) to milliseconds.
The obvious way to do this is to divide by 1000000. However, I do this conversion quite often and wonder whether it could become a bottleneck.
Is there an optimized divide operation when dealing with numbers like 1000000?
Note: any code must be portable. I'm using gcc, and this generally runs on SPARC hardware.
Some quick testing using the code below... I hope it's right:
#include <sys/time.h>
#include <iostream>

using namespace std;

// Multiplying by the reciprocal trades a divide for a floating-point multiply.
const double NANOSECONDS_TO_MILLISECONDS = 1.0 / 1000000.0;

int main()
{
    hrtime_t start;
    hrtime_t tmp;
    hrtime_t fin;

    // Method 1: multiply by the floating-point reciprocal.
    start = gethrtime();
    tmp = (hrtime_t)(start * NANOSECONDS_TO_MILLISECONDS);
    fin = gethrtime();
    cout << "Method 1" << endl;
    cout << "Original val: " << start << endl;
    cout << "Computed: " << tmp << endl;
    cout << "Time:" << fin - start << endl;

    // Method 2: plain integer division.
    start = gethrtime();
    tmp = start / 1000000;
    fin = gethrtime();
    cout << "Method 2" << endl;
    cout << "Original val: " << start << endl;
    cout << "Computed: " << tmp << endl;
    cout << "Time:" << fin - start << endl;

    return 0;
}
Example outputs:
Original val: 3048161553965997
Computed: 3048161553
Time:82082
Original val: 3048161556359586
Computed: 3048161556
Time:31230
Original val: 3048239663018915
Computed: 3048239663
Time:79381
Original val: 3048239665393873
Computed: 3048239665
Time:31321
Original val: 3048249874282285
Computed: 3048249874
Time:81812
Original val: 3048249876664084
Computed: 3048249876
Time:34830
If this is correct, then multiplying by the reciprocal is actually slower in this case, probably because it goes through floating-point math instead of fixed-point math. I'll just stick with integer division, which takes hardly any time at all anyway.
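For reference, a fixed-point version of the reciprocal trick (no floating point) might look like the sketch below. It assumes 32-bit inputs; full 64-bit hrtime_t values would need a 128-bit intermediate such as gcc's __int128, and RECIP_1E6 and ns_to_ms are just names I made up for illustration.

#include <stdint.h>

/* ceil(2^50 / 1000000): the reciprocal of one million in 50-bit fixed point. */
static const uint64_t RECIP_1E6 = (((uint64_t)1 << 50) + 999999) / 1000000;

/* Multiply-and-shift replacement for ns / 1000000, exact for all 32-bit ns. */
uint32_t ns_to_ms(uint32_t ns)
{
    return (uint32_t)(((uint64_t)ns * RECIP_1E6) >> 50);
}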
Let your compiler figure it out!
Seriously, if you're really concerned about optimization at this level (and you shouldn't be unless it shows up in a profile), you should get used to looking at your compiler's assembly-language output. You will be amazed at what the compiler is doing on your behalf.
All the people recommending math tricks either have bad compilers or are underestimating their compilers. For example, try compiling this function:
unsigned long div1000000(unsigned long n) {
    return n / 1000000UL;
}
Compiled with gcc 4.3.3 on x86 (-O3, -fomit-frame-pointer), I get:
$ objdump -d div.o -M intel
div.o:     file format elf32-i386
Disassembly of section .text:
00000000 <div1000000>:
0: b8 83 de 1b 43 mov eax,0x431bde83
5: f7 64 24 04 mul DWORD PTR [esp+0x4]
9: c1 ea 12 shr edx,0x12
c: 89 d0 mov eax,edx
e: c3 ret
In other words, the compiler took n / 1000000UL and turned it into (unsigned long long)(n * 0x431bde83) >> (0x12 + 32). Why does that work? Off the top of my head, I have no idea! But the compiler decided that it would be faster than issuing a native divide.
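For what it's worth, the magic number decodes neatly: 0x431bde83 is ceil(2^50 / 1000000), and the shift of 0x12 (18) on edx, on top of the implicit 32-bit shift from taking the high half of the product, gives 50 in total. So the compiler is computing floor(n * ceil(2^50 / 10^6) / 2^50), and the rounding error in the scaled reciprocal is small enough that this matches floor(n / 10^6) for every 32-bit n. A quick spot-check of that claim (a sketch, not a proof):

#include <stdint.h>
#include <assert.h>

int main(void)
{
    const uint64_t m = 0x431bde83;  /* ceil(2^50 / 1000000) */

    /* Compare the multiply-and-shift against a real division over a
       sample of 32-bit inputs. */
    for (uint64_t n = 0; n <= 0xffffffffULL; n += 12345) {
        assert(((n * m) >> 50) == n / 1000000);
    }
    return 0;
}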
Moral of the story: let your compiler figure it out.