Union types and Intersection types

missingfaktor picture missingfaktor · Apr 13, 2011 · Viewed 9.2k times · Source

What are the various use cases for union types and intersection types? There has been lately a lot of buzz about these type system features, yet somehow I have never felt need for either of these!

Answer

Don Stewart picture Don Stewart · Apr 13, 2011

Union Types

To quote Robert Harper, "Practical Foundations for Programming Languages", ch 15:

Most data structures involve alternatives such as the distinction between a leaf and an interior node in a tree, or a choice in the outermost form of a piece of abstract syntax. Importantly, the choice determines the structure of the value. For example, nodes have children, but leaves do not, and so forth. These concepts are expressed by sum types, specifically the binary sum, which offers a choice of two things, and the nullary sum, which offers a choice of no things.

Booleans

The simplest sum type is the Boolean,

data Bool = True
          | False

Booleans have only two valid values, T or F. So instead of representing them as numbers, we can instead use a sum type to more accurately encode the fact there are only two possible values.

Enumerations

Enumerations are examples of more general sum types: ones with many, but finite, alternative values.

Sum types and null pointers

The best practically motivating example for sum types is discriminating between valid results and error values returned by functions, by distinguishing the failure case.

For example, null pointers and end-of-file characters are hackish encodings of the sum type:

data Maybe a = Nothing
             | Just a

where we can distinguish between valid and invalid values by using the Nothing or Just tag to annotate each value with its status.

By using sum types in this way we can rule out null pointer errors entirely, which is a pretty decent motivating example. Null pointers are entirely due to the inability of older languages to express sum types easily.

Intersection Types

Intersection types are much newer, and their applications are not as widely understood. However, Benjamin Pierce's thesis ("Programming with Intersection Types and Bounded Polymorphism") gives a good overview:

The most intriguing and potentially useful property of intersection types is their ability to express an essentially unbounded (though of course finite) amount of information about the components of a program.

For example, the addition function (+) can be given the type Int -> Int -> Int ^ Real -> Real -> Real, capturing both the general fact that the sum of two real numbers is always a real and the more specialized fact that the sum of two integers is always an integer. A compiler for a language with intersection types might even provide two different object-code sequences for the two versions of (+), one using a floating point addition instruction and one using integer addition. For each instance of+ in a program, the compiler can decide whether both arguments are integers and generate the more efficient object code sequence in this case.

This kind of finitary polymorphism or coherent overloading is so expressive, that ... the set of all valid typings for a program amounts to a complete characterization of the program’s behavior

They let us encode a lot of information in the type, explaining via type theory what multiple inheritance means, giving types to type classes,