What is reification?

Martijn · Aug 7, 2015 · Viewed 12.8k times

I know that Java implements parametric polymorphism (Generics) with erasure. I understand what erasure is.

I know that C# implements parametric polymorphism with reification. I know that this lets you write

public void dosomething(List<string> input) {}
public void dosomething(List<int> input) {}

or that you can know at runtime what the type parameter of some parameterised type is, but I don't understand what it is.

  • What is a reified type?
  • What is a reified value?
  • What happens when a type/value is reified?

Answer

Theodoros Chatzigiannakis · Aug 7, 2015

Reification is the process of taking an abstract thing and creating a concrete thing.

The term reification in C# generics refers to the process by which a generic type definition and one or more generic type arguments (the abstract thing) are combined to create a new generic type (the concrete thing).

To phrase it differently, it is the process of taking the definition of List<T> and int and producing a concrete List<int> type.

To understand it further, compare the following approaches:

  • In Java generics, a generic type definition is transformed to essentially one concrete generic type shared across all allowed type argument combinations. Thus, multiple (source code level) types are mapped to one (binary level) type - but as a result, information about the type arguments of an instance is discarded in that instance (type erasure).

    1. As a side effect of this implementation technique, the only generic type arguments that are natively allowed are types that can share the binary code of their concrete type. That means types whose storage locations have interchangeable representations, which in turn means reference types. Using value types as generic type arguments requires boxing them (placing them in a simple reference type wrapper).
    2. No code is duplicated in order to implement generics this way.
    3. Type information that could have been available at runtime (using reflection) is lost. This, in turn, means that specialization of a generic type (the ability to use specialized source code for any particular generic argument combination) is very restricted.
    4. This mechanism doesn't require support from the runtime environment.
    5. There are a few workarounds to retain type information that a Java program or a JVM-based language can use.
  • In C# generics, the generic type definition is maintained in memory at runtime. Whenever a new concrete type is required, the runtime environment combines the generic type definition and the type arguments and creates the new type (reification). So we get a new type for each combination of the type arguments, at runtime.

    1. This implementation technique allows any kind of type argument combination to be instantiated. Using value types as generic type arguments does not cause boxing, since these types get their own implementation. (Boxing still exists in C#, of course - but it happens in other scenarios, not this one.)
    2. Code duplication could be an issue - but in practice it isn't, because sufficiently smart implementations (this includes Microsoft .NET and Mono) can share code for some instantiations.
    3. Type information is maintained, which allows specialization to an extent, by examining type arguments using reflection. However, the degree of specialization is limited, because a generic type definition is compiled before any reification happens (this is done by compiling the definition against the constraints on the type parameters - thus, the compiler has to be able to "understand" the definition even in the absence of specific type arguments).
    4. This implementation technique depends heavily on runtime support and JIT-compilation (which is why you often hear that C# generics have some limitations on platforms like iOS, where dynamic code generation is restricted).
    5. In the context of C# generics, reification is done for you by the runtime environment. However, if you want to more intuitively understand the difference between a generic type definition and a concrete generic type, you can always perform a reification on your own, using the System.Type class (even if the particular generic type argument combination you're instantiating didn't appear in your source code directly).
  • In C++ templates, the template definition is maintained in memory at compile time. Whenever a new instantiation of a template type is required in the source code, the compiler combines the template definition and the template arguments and creates the new type. So we get a unique type for each combination of the template arguments, at compile time.

    1. This implementation technique allows any kind of type argument combination to be instantiated.
    2. This is known to duplicate binary code but a sufficiently smart tool-chain could still detect this and share code for some instantiations.
    3. The template definition itself is not "compiled" - only its concrete instantiations are actually compiled. This places fewer constraints on the compiler and allows a greater degree of template specialization.
    4. Since template instantiations are performed at compile time, no runtime support is needed here either.
    5. This process has lately been referred to as monomorphization, especially in the Rust community. The word is used in contrast to parametric polymorphism, which is the name of the concept that generics come from.
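
To make the Java side of this comparison concrete, here is a minimal sketch of what erasure looks like at runtime, together with one of the workarounds for retaining type information mentioned above. The names ErasureDemo, TypeToken, shareRuntimeClass and capturedTypeName are hypothetical and chosen only for illustration; the pattern itself is often called a "super type token", and this is just one possible way to write it.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {

    // Hypothetical helper (not a standard library class). Creating an
    // anonymous subclass stores the type argument in that subclass's class
    // metadata, which is not erased and can be read back with reflection.
    public abstract static class TypeToken<T> {
        private final Type type;

        protected TypeToken() {
            ParameterizedType superType =
                (ParameterizedType) getClass().getGenericSuperclass();
            this.type = superType.getActualTypeArguments()[0];
        }

        public Type getType() {
            return type;
        }
    }

    // Erasure: both lists are instances of the same runtime class, so the
    // instances themselves cannot tell String apart from Integer.
    public static boolean shareRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> integers = new ArrayList<>();
        return strings.getClass() == integers.getClass();
    }

    // Workaround: the anonymous subclass created by the trailing {} keeps
    // the full type argument, here List<String>, available to reflection.
    public static String capturedTypeName() {
        TypeToken<List<String>> token = new TypeToken<List<String>>() {};
        return token.getType().getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(shareRuntimeClass()); // prints "true"
        System.out.println(capturedTypeName());
    }
}
```

For the same reason, a Java class cannot declare both dosomething(List<String>) and dosomething(List<Integer>): after erasure the two methods have the same signature, so the compiler rejects the pair. Under reification, as in C#, the two parameter types are distinct at runtime and the overloads can coexist.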