Switch statement vs. Dictionary of Func values: which is faster and why?

cubetwo1729 · Jul 23, 2012

Suppose there is the following code:

private static int DoSwitch(string arg)
{
    switch (arg)
    {
        case "a": return 0;
        case "b": return 1;
        case "c": return 2;
        case "d": return 3;
    }
    return -1;
}

private static Dictionary<string, Func<int>> dict = new Dictionary<string, Func<int>>
    {
        {"a", () => 0 },
        {"b", () => 1 },
        {"c", () => 2 },
        {"d", () => 3 },
    };

private static int DoDictionary(string arg)
{
    return dict[arg]();
}

By iterating over both methods and comparing the timings, I find that the dictionary is slightly faster, even when "a", "b", "c", "d" are expanded to include more keys. Why is this so?
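The comparison loop is roughly along these lines (a minimal sketch; the iteration count is arbitrary and the timing is done with Stopwatch):

private static void Compare()
{
    const int iterations = 10000000;
    int sum = 0;

    // Accumulate the results so the calls can't be optimized away.
    var sw = System.Diagnostics.Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
        sum += DoSwitch("c");
    sw.Stop();
    Console.WriteLine("switch:     {0} ms (sum={1})", sw.ElapsedMilliseconds, sum);

    sum = 0;
    sw.Restart();
    for (int i = 0; i < iterations; i++)
        sum += DoDictionary("c");
    sw.Stop();
    Console.WriteLine("dictionary: {0} ms (sum={1})", sw.ElapsedMilliseconds, sum);
}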

Does this have to do with cyclomatic complexity? Is it because the jitter compiles the return statements in the dictionary to native code only once? Is it because the lookup of the dictionary is O(1), which may not be the case for a switch statement? (These are just guesses.)

Answer

KeithS · Jul 23, 2012

The short answer is that the switch statement, as compiled here, executes linearly, while the dictionary lookup takes effectively constant time.

At the IL level, a small switch statement is usually implemented as a series of if-elseif statements comparing the switched variable for equality against each case. So, this statement executes in time linearly proportional to the number of valid options for arg; the cases are compared in the order they appear, and the worst case is that every comparison is tried and either the last one matches or none do. So, with 32 options, the worst case is that the value matches none of them, and the code will have made 32 comparisons to determine this.
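Conceptually (this is the equivalent C#, not the literal IL), the small string switch above lowers to an if/else-if chain like this:

// Rough equivalent of what a small string switch compiles down to:
// an ordered chain of string equality checks.
private static int DoSwitchLowered(string arg)
{
    if (arg == "a") return 0;
    if (arg == "b") return 1;
    if (arg == "c") return 2;
    if (arg == "d") return 3;
    return -1;
}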

A Dictionary, on the other hand, uses an index-optimized collection to store values. In .NET, Dictionary<TKey, TValue> is a hash table, which has effectively constant access time (the trade-off being the extra memory used by its bucket and entry arrays). Other options commonly used for "mapping" collections include balanced tree structures such as red-black trees (SortedDictionary<TKey, TValue> in .NET), which provide logarithmic access and linear space usage. Any of these will let the code find the key corresponding to the proper "case" in the collection (or determine that it does not exist) much faster than a switch statement compiled to sequential comparisons can do the same.
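As an aside, if the mapped values are plain constants, the Func<int> indirection isn't strictly needed; a hypothetical Dictionary<string, int> variant gives the same effectively constant-time lookup without the extra delegate invocation:

// Hypothetical variant: map keys directly to int values, skipping the delegate call.
private static readonly Dictionary<string, int> valueDict = new Dictionary<string, int>
{
    { "a", 0 },
    { "b", 1 },
    { "c", 2 },
    { "d", 3 },
};

private static int DoDictionaryDirect(string arg)
{
    int result;
    // TryGetValue avoids throwing KeyNotFoundException for unknown keys;
    // -1 mirrors the switch's fall-through return value.
    return valueDict.TryGetValue(arg, out result) ? result : -1;
}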

EDIT: Other answers and commenters have touched on this, so in the interest of completeness I will as well. The Microsoft compiler does not always compile a switch to an if/elseif chain as I implied originally. It typically does so with small numbers of cases and/or with "sparse" cases (non-contiguous values, like 1, 200, 4000). With larger sets of adjacent cases, the compiler converts the switch into a "jump table" using the CIL switch instruction. With large sets of sparse cases, the compiler can implement a binary search to narrow the field, and then "fall through" to a small number of comparisons for the sparse cases or to a jump table for adjacent ones.
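For illustration (using integral cases, since jump tables apply to integer-like switch values), a dense, adjacent set of cases is the kind the compiler can turn into a jump table, while sparse values push it back toward comparisons or a binary search; both methods below are hypothetical:

// Dense, adjacent cases: a natural candidate for the CIL switch (jump table) instruction.
private static int DenseSwitch(int n)
{
    switch (n)
    {
        case 0: return 10;
        case 1: return 20;
        case 2: return 30;
        case 3: return 40;
        default: return -1;
    }
}

// Sparse cases: a jump table would need ~4000 slots for 3 entries,
// so the compiler falls back to comparisons instead.
private static int SparseSwitch(int n)
{
    switch (n)
    {
        case 1: return 10;
        case 200: return 20;
        case 4000: return 30;
        default: return -1;
    }
}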

However, the compiler will typically choose the implementation that is the best compromise between performance and space efficiency, so it will only use a jump table for a large number of densely packed cases. This is because a jump table requires space in memory on the order of the range of cases it must cover, which for sparse cases is terribly inefficient memory-wise. By using a Dictionary in source code, you basically force the compiler's hand: it will do it your way instead of trading performance for memory efficiency.

So, in most situations where either a switch statement or a Dictionary could be used, I would expect the Dictionary to perform better. That said, switch statements with large numbers of cases should be avoided anyway, as they are less maintainable.