c switch and jump tables

Daniel Miron picture Daniel Miron · Jun 12, 2013 · Viewed 8.2k times · Source

It is my understanding that a switch statement in c/c++ will sometimes compile to a jump table. My question is, are there any thumb rules to assure that?

In my case I'm doing something like this:

enum myenum{
MY_CASE0= 0,
MY_CASE0= 1, 
.
.
.
};

switch(foo)
{
  case MY_CASE0:
  //do stuff
  break;
  case MY_CASE1:
  //do stuff
  break;
 .
 .
 .
}

I cover all the cases from 1 to n by order. Is safe to assume it will compile to a jump table? The original code was a long and messy if else statement, so at the very least I gain some readability.

Answer

Mats Petersson picture Mats Petersson · Jun 12, 2013

A good compiler can and will choose between a jump table, a chained if/else or a combination. A poorly designed compiler may not make such a choice - and may even produce very bad code for switch-blocks. But any decent compiler should produce efficient code for switch-blocks. T

he major decision factor here is that the compiler may choose if/else when the numbers are far apart [and not trivially (e.g. dividing by 2, 4, 8, 16, 256 etc) changed to a closer value], e.g.

 switch(x)
 {
    case 1:
     ...
    case 4912:
     ...
    case 11211:
     ...
    case 19102:
     ...
 }

would require a jump table of at least 19102 * 2 bytes.

On the other hand, if the numbers are close together, the compiler will typically use a jumptable.

Even if it's a if/else type of design, it will typically do a "binary search" - if we take the above example:

 if (x <= 4912)
 {
     if (x == 1)
     {
        ....
     }
     else if (x == 4912)
     {
         .... 
     }
 } else {
     if (x == 11211)
     {
         ....
     }
     else if (x == 19102)
     {
         ...
     }
 }

If we have LOTS of cases, this approach will nest quite deep, and humans will probably get lost after three or four levels of depth (bearing in mind that each if starts at some point in the MIDDLE of the range), but it reduces the number of tests by a log2(n) where n is the number of choices. It is certainly a lot more efficient than the naive approach of

if (x == first value) ... 
else if (x == second value) ... 
else if (x == third value) ... 
..
else if (x == nth value) ... 
else ... 

This can be slightly better if certain values are put at the beginning of the if-else chain, but that assumes you can determine what is the most common before running the code.

If performance is CRITICAL to your case, then you need to benchmark the two alternatives. But my guess is that just writing the code as a switch will make the code much clearer, and at the same time run at least as fast, if not faster.