What's the cost of typeid?

David · Nov 11, 2011 · Viewed 9.1k times

I'm considering a type erasure setup that uses typeid to resolve the type like so...

#include <iostream>
#include <typeinfo>

struct BaseThing
{
    virtual ~BaseThing() = 0; // pure virtual, so BaseThing stays abstract
};

BaseThing::~BaseThing() {} // a pure virtual destructor still needs a definition

template<typename T>
struct Thing : public BaseThing
{
    T x;
};

struct A{};
struct B{};

int main() 
{
    BaseThing* pThing = new Thing<B>();
    const std::type_info& x = typeid(*pThing);

    if( x == typeid(Thing<B>))
    {
        std::cout << "pThing is a Thing<B>!\n";
        Thing<B>* pB = static_cast<Thing<B>*>(pThing);
    }
    else if( x == typeid(Thing<A>))
    {
        std::cout << "pThing is a Thing<A>!\n";
        Thing<A>* pA = static_cast<Thing<A>*>(pThing);
    }
    delete pThing; // BaseThing's virtual destructor makes this safe
}

I've never seen anyone else do this. The alternative would be for BaseThing to have a pure virtual GetID() which would be used to deduce the type instead of using typeid. In this situation, with only 1 level of inheritance, what's the cost of typeid vs the cost of a virtual function call? I know typeid uses the vtable somehow, but how exactly does it work?

This would be desirable instead of GetID() because it takes quite a bit of hackery to try to make sure the IDs are unique and deterministic.
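For concreteness, a sketch of what that GetID() approach might look like (the static-counter scheme below is one common way to do it, not settled code; notice that the ids depend on which Thing<T> is queried first, which is exactly the determinism problem):

// A GetID() variant of the hierarchy above.
struct BaseThing
{
    virtual ~BaseThing() {}
    virtual int GetID() const = 0;
};

// Hands out a fresh id the first time each Thing<T> asks for one.
inline int NextTypeID()
{
    static int counter = 0;
    return counter++;
}

template<typename T>
struct Thing : public BaseThing
{
    T x;
    static int ID()
    {
        static const int id = NextTypeID(); // assigned on first use, so unique
        return id;                          // but not stable across builds or call orders
    }
    int GetID() const override { return ID(); }
};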

Answer

Quuxplusone · Jun 28, 2017

The alternative would be for BaseThing to have a pure virtual GetID() which would be used to deduce the type instead of using typeid. In this situation, with only 1 level of inheritance, what's the cost of typeid vs the cost of a virtual function call? I know typeid uses the vtable somehow, but how exactly does it work?

On Linux and Mac, or anything else using the Itanium C++ ABI, typeid(x) compiles into two load instructions — it simply loads the vptr (that is, the address of some vtable) from the first 8 bytes of object x, and then loads the -1th pointer from that vtable. That pointer is &typeid(x). This is one function call less expensive than calling a virtual method.
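Concretely, a hand-written equivalent would look something like the snippet below. This is an illustration only: it just transcribes the two loads described above into deliberately non-portable C++ that you'd never write for real, since the compiler emits this inline for you.

// Uses the BaseThing from the question; needs <typeinfo>.
const std::type_info& itanium_typeid(const BaseThing* p)
{
    // Load 1: the vptr, stored in the first pointer-sized slot of *p.
    void* const* vtable = *reinterpret_cast<void* const* const*>(p);
    // Load 2: the type_info pointer, one slot before the vtable's
    // address point (index -1).
    return *static_cast<const std::type_info*>(vtable[-1]);
}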

On Windows, it involves on the order of four load instructions and a couple of (negligible) ALU ops, because the Microsoft C++ ABI is a bit more enterprisey. (source) This might end up being on par with a virtual method call, honestly. But that's still dirt cheap compared to a dynamic_cast.

A dynamic_cast involves a function call into the C++ runtime, which has a lot of loads and conditional branches and such.
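For comparison, here is the dispatch from the question rewritten with dynamic_cast; every test below is a call into the runtime (libc++abi's __dynamic_cast):

// Same pThing as in the question's main(); each failed cast still
// pays for the full runtime call and its internal branches.
if (auto* pB = dynamic_cast<Thing<B>*>(pThing))
{
    std::cout << "pThing is a Thing<B>!\n";
}
else if (auto* pA = dynamic_cast<Thing<A>*>(pThing))
{
    std::cout << "pThing is a Thing<A>!\n";
}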

So yes, exploiting typeid will be much, much faster than dynamic_cast. Will it be correct for your use-case? That's questionable. (See the other answers about Liskov substitutability and such.) But will it be fast? Yes.

Here, I took the toy benchmark code from Vaughn's highly-rated answer and made it into an actual benchmark, avoiding the obvious loop-hoisting optimization that borked all his timings.
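Here's a sketch of that harness, assuming Google Benchmark (the -lbenchmark in the compile lines below); the two-derived-class hierarchy and the id() method are reconstructed from Vaughn's answer rather than copied verbatim:

#include <benchmark/benchmark.h>
#include <memory>
#include <typeinfo>
#include <vector>

struct Base {
    virtual ~Base() = default;
    virtual int id() const = 0;
};
struct DerivedA : Base { int id() const override { return 1; } };
struct DerivedB : Base { int id() const override { return 2; } };

// A mixed bag of objects, so the compiler can't hoist the type
// test out of the loop and time an empty body.
static std::vector<std::unique_ptr<Base>> make_input()
{
    std::vector<std::unique_ptr<Base>> v;
    for (int i = 0; i < 1000; ++i) {
        if (i % 2) v.push_back(std::make_unique<DerivedA>());
        else       v.push_back(std::make_unique<DerivedB>());
    }
    return v;
}

static void bench_dynamic_cast(benchmark::State& state)
{
    auto v = make_input();
    for (auto _ : state) {
        int hits = 0;
        for (auto& p : v)
            if (dynamic_cast<DerivedA*>(p.get())) ++hits;
        benchmark::DoNotOptimize(hits);
    }
}
BENCHMARK(bench_dynamic_cast);

static void bench_typeid(benchmark::State& state)
{
    auto v = make_input();
    for (auto _ : state) {
        int hits = 0;
        for (auto& p : v)
            if (typeid(*p) == typeid(DerivedA)) ++hits;
        benchmark::DoNotOptimize(hits);
    }
}
BENCHMARK(bench_typeid);

static void bench_id_method(benchmark::State& state)
{
    auto v = make_input();
    for (auto _ : state) {
        int hits = 0;
        for (auto& p : v)
            if (p->id() == 1) ++hits;
        benchmark::DoNotOptimize(hits);
    }
}
BENCHMARK(bench_id_method);

BENCHMARK_MAIN();

Result, for libc++abi on my MacBook: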

$ g++ test.cc -lbenchmark -std=c++14; ./a.out
Run on (4 X 2400 MHz CPU s)
2017-06-27 20:44:12
Benchmark                   Time           CPU Iterations
---------------------------------------------------------
bench_dynamic_cast      70407 ns      70355 ns       9712
bench_typeid            31205 ns      31185 ns      21877
bench_id_method         30453 ns      29956 ns      25039

$ g++ test.cc -lbenchmark -std=c++14 -O3; ./a.out
Run on (4 X 2400 MHz CPU s)
2017-06-27 20:44:27
Benchmark                   Time           CPU Iterations
---------------------------------------------------------
bench_dynamic_cast      57613 ns      57591 ns      11441
bench_typeid            12930 ns      12844 ns      56370
bench_id_method         20942 ns      20585 ns      33965

(Lower ns is better. You can ignore the latter two columns: "CPU" just shows that it's spending all its time running and no time waiting, and "Iterations" is just the number of runs it took to get a good margin of error.)

You can see that typeid thrashes dynamic_cast even at -O0, but when you turn on optimizations, it does even better — because the compiler can optimize any code that you write. All that ugly code hidden inside libc++abi's __dynamic_cast function can't be optimized by the compiler any more than it already has been, so turning on -O3 didn't help much.