nanoseconds to milliseconds - fast division by 1000000

Matt · Aug 13, 2009 · Viewed 39.6k times

I want to convert the output of gethrtime to milliseconds.

The obvious way to do this is to divide by 1000000. However, I'm doing this quite often and wonder if it could become a bottleneck.

Is there an optimized divide operation when dealing with numbers like 1000000?

Note: Any code must be portable. I'm using gcc, and this is generally on SPARC hardware.

Some quick testing using the code below... hope that is right.

#include <sys/time.h>
#include <iostream>

using namespace std;

const double NANOSECONDS_TO_MILLISECONDS = 1.0 / 1000000.0;

int main()
{
    hrtime_t start;
    hrtime_t tmp;
    hrtime_t fin;

    start = gethrtime();
    tmp = (hrtime_t)(start * NANOSECONDS_TO_MILLISECONDS);
    fin = gethrtime();

    cout << "Method 1"
    cout << "Original val: " << start << endl;
    cout << "Computed: " << tmp << endl;
    cout << "Time:" << fin - start << endl;

    start = gethrtime();
    tmp = (start / 1000000);
    fin = gethrtime();

    cout "Method 2"    
    cout << "Original val: " << start << endl;
    cout << "Computed: " << tmp << endl;
    cout << "Time:" << fin - start << endl;

    return 0;
}  

Example outputs:

Original val: 3048161553965997
Computed: 3048161553
Time:82082
Original val: 3048161556359586
Computed: 3048161556
Time:31230

Original val: 3048239663018915
Computed: 3048239663
Time:79381
Original val: 3048239665393873
Computed: 3048239665
Time:31321

Original val: 3048249874282285
Computed: 3048249874
Time:81812
Original val: 3048249876664084
Computed: 3048249876
Time:34830

If this is correct, then the multiply-by-reciprocal is actually slower in this case, probably because it uses floating point math instead of fixed point math. I will just stick with integer division, which still takes hardly any time at all.
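Since a single back-to-back conversion like the above mostly measures the cost of the gethrtime call itself, a loop that averages over many conversions might give steadier numbers. A rough sketch (it assumes the same Solaris gethrtime/hrtime_t declarations from <sys/time.h>; the iteration count is picked arbitrarily):

#include <sys/time.h>
#include <iostream>

using namespace std;

const double NANOSECONDS_TO_MILLISECONDS = 1.0 / 1000000.0;

int main()
{
    const long ITERATIONS = 10000000;
    volatile hrtime_t sink = 0;   // volatile so the loops aren't optimized away
    hrtime_t start, fin;

    // Method 1: multiply by the precomputed reciprocal (floating point)
    start = gethrtime();
    for (long i = 0; i < ITERATIONS; ++i)
        sink = (hrtime_t)((start + i) * NANOSECONDS_TO_MILLISECONDS);
    fin = gethrtime();
    cout << "reciprocal: " << (fin - start) / ITERATIONS << " ns per conversion" << endl;

    // Method 2: plain integer division
    start = gethrtime();
    for (long i = 0; i < ITERATIONS; ++i)
        sink = (start + i) / 1000000;
    fin = gethrtime();
    cout << "division:   " << (fin - start) / ITERATIONS << " ns per conversion" << endl;

    return 0;
}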

Answer

Josh Haberman · Aug 13, 2009

Let your compiler figure it out!

Seriously, if you're really concerned about optimizations at this level (and you shouldn't be unless it shows up in a profile), you should get used to looking at your compiler's assembly language output. You will be amazed what the compiler is doing on your behalf.

All the people who are recommending math tricks either have bad compilers or they are underestimating their compilers. For example, try compiling this function:

unsigned long div1000000(unsigned long n) {
  return n / 1000000UL;
}

Compiled with gcc 4.3.3 on x86 (-O3, -fomit-frame-pointer), I get:

$ objdump -d div.o -M intel

div.o:     file format elf32-i386


Disassembly of section .text:

00000000 <div1000000>:
   0:   b8 83 de 1b 43          mov    eax,0x431bde83
   5:   f7 64 24 04             mul    DWORD PTR [esp+0x4]
   9:   c1 ea 12                shr    edx,0x12
   c:   89 d0                   mov    eax,edx
   e:   c3                      ret    

In other words, the compiler took n / 1000000UL and turned it into ((unsigned long long)n * 0x431bde83) >> (0x12 + 32). Why does that work? Off the top of my head, I have no idea! But the compiler decided that it would be faster than issuing a native divide.
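If you want to convince yourself that it does work: the constant 0x431bde83 is exactly ceil(2^50 / 1000000), i.e. a fixed-point encoding of the reciprocal of 10^6, and the mul/shr pair keeps the top bits of the product. A small sketch that checks the compiler's trick against plain division (the constant and the shift of 50 = 0x12 + 32 are copied from the disassembly above, and the check only covers 32-bit inputs, matching the 32-bit unsigned long argument on that target):

#include <iostream>

int main()
{
    // Step through the 32-bit range; 997 keeps the run short while hitting many values.
    for (unsigned long long n = 0; n <= 0xFFFFFFFFULL; n += 997) {
        unsigned long long via_magic = (n * 0x431bde83ULL) >> 50;   // compiler's multiply-and-shift
        unsigned long long via_div   = n / 1000000ULL;              // plain division
        if (via_magic != via_div) {
            std::cout << "mismatch at " << n << std::endl;
            return 1;
        }
    }
    std::cout << "magic multiply matches n / 1000000 for all tested 32-bit values" << std::endl;
    return 0;
}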

Moral of the story:

  • don't optimize this unless you're sure it's a bottleneck.
  • don't do fancy arithmetic (multiplying by the reciprocal, shifts, etc.) unless you already know what your compiler is doing and you think you can beat it.
  • benchmark the result -- only leave in a wart like fancy bitmath if you have demonstrated that you've outdone your compiler.