What is more efficient a switch case or an std::map

the_drow picture the_drow · May 31, 2009 · Viewed 14.2k times · Source

I'm thinking about the tokenizer here.
Each token calls a different function inside the parser.
What is more efficient:

  • A map of std::functions/boost::functions
  • A switch case

Answer

Nicolas Dumazet picture Nicolas Dumazet · May 31, 2009

I would suggest reading switch() vs. lookup table? from Joel on Software. Particularly, this response is interesting:

" Prime example of people wasting time trying to optimize the least significant thing."

Yes and no. In a VM, you typically call tiny functions that each do very little. It's the not the call/return that hurts you as much as the preamble and clean-up routine for each function often being a significant percentage of the execution time. This has been researched to death, especially by people who've implemented threaded interpreters.

In virtual machines, lookup tables storing computed addresses to call are usually preferred to switches. (direct threading, or "label as values". directly calls the label address stored in the lookup table) That's because it allows, under certain conditions, to reduce branch misprediction, which is extremely expensive in long-pipelined CPUs (it forces to flush the pipeline). It, however, makes the code less portable.

This issue has been discussed extensively in the VM community, I would suggest you to look for scholar papers in this field if you want to read more about it. Ertl & Gregg wrote a great article on this topic in 2001, The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures

But as mentioned, I'm pretty sure that these details are not relevant for your code. These are small details, and you should not focus too much on it. Python interpreter is using switches, because they think it makes the code more readable. Why don't you pick the usage you're the most comfortable with? Performance impact will be rather small, you'd better focus on code readability for now ;)

Edit: If it matters, using a hash table will always be slower than a lookup table. For a lookup table, you use enum types for your "keys", and the value is retrieved using a single indirect jump. This is a single assembly operation. O(1). A hash table lookup first requires to calculate a hash, then to retrieve the value, which is way more expensive.

Using an array where the function addresses are stored, and accessed using values of an enum is good. But using a hash table to do the same adds an important overhead

To sum up, we have:

  • cost(Hash_table) >> cost(direct_lookup_table)
  • cost(direct_lookup_table) ~= cost(switch) if your compiler translates switches into lookup tables.
  • cost(switch) >> cost(direct_lookup_table) (O(N) vs O(1)) if your compiler does not translate switches and use conditionals, but I can't think of any compiler doing this.
  • But inlined direct threading makes the code less readable.