Why are delegates reference types?

Dan Tao · Oct 26, 2011

Quick note on the accepted answer: I disagree with a small part of Jeffrey's answer, namely the point that since Delegate had to be a reference type, it follows that all delegates are reference types. (It simply isn't true that a multi-level inheritance chain rules out value types; all enum types, for example, inherit from System.Enum, which in turn inherits from System.ValueType, which inherits from System.Object, and all three of those are reference types.) However, I think the fact that all delegates inherit not just from Delegate but from MulticastDelegate is the critical realization here. As Raymond points out in a comment to his answer, once you've committed to supporting multiple subscribers, there's really no point in not using a reference type for the delegate itself, given the need for an array somewhere.
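The two inheritance chains mentioned in this note can be checked with reflection. This is only a small sketch; `TypeChainDemo` and `Describe` are names made up for the illustration, and `DayOfWeek` is just a convenient built-in enum.

```csharp
using System;
using System.Collections.Generic;

static class TypeChainDemo
{
    // Walks the BaseType links of a type and labels each step
    // as a value type or a reference type.
    public static string Describe(Type type)
    {
        var parts = new List<string>();
        for (var current = type; current != null; current = current.BaseType)
        {
            parts.Add(current.Name + (current.IsValueType ? " (value)" : " (reference)"));
        }
        return string.Join(" -> ", parts);
    }

    static void Main()
    {
        // An enum is a value type even though every ancestor is a reference type.
        Console.WriteLine(Describe(typeof(DayOfWeek)));
        // A delegate type and all of its ancestors are reference types.
        Console.WriteLine(Describe(typeof(Action)));
    }
}
```

So an inheritance chain by itself does not force a type to be a reference type; the runtime special-cases types deriving from System.ValueType and System.Enum.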


See update at bottom.

It has always seemed strange to me that if I do this:

Action foo = obj.Foo;

I am creating a new Action object, every time. I'm sure the cost is minimal, but it involves allocation of memory to later be garbage collected.
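To make the allocation visible, here is a minimal sketch. The `Obj` class and the `DelegateAllocationDemo` name are placeholders invented for this example. It converts the same instance method to a delegate twice and compares the results.

```csharp
using System;

class Obj
{
    public void Foo() { }
}

static class DelegateAllocationDemo
{
    public static (bool SameObject, bool EqualByValue) Compare()
    {
        var obj = new Obj();

        // Each method group conversion on an instance method
        // creates a separate delegate object.
        Action first = obj.Foo;
        Action second = obj.Foo;

        // Delegate.Equals compares target and method, not identity.
        return (ReferenceEquals(first, second), first.Equals(second));
    }

    static void Main()
    {
        var result = Compare();
        Console.WriteLine($"Same object: {result.SameObject}");
        Console.WriteLine($"Equal by value: {result.EqualByValue}");
    }
}
```

The two delegates are distinct heap objects that compare as equal, which is the value-like behavior that prompted the question.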

Given that delegates are themselves inherently immutable, I wonder why they couldn't be value types? Then a line of code like the one above would incur nothing more than a simple assignment to a memory address on the stack*.

Even considering anonymous functions, it seems (to me) this would work. Consider the following simple example.

Action foo = () => { obj.Foo(); };

In this case foo does constitute a closure, yes. And in many cases, I imagine this does require an actual reference type (such as when local variables are closed over and are modified within the closure). But in some cases, it shouldn't. For instance, in the above case, it seems that a type to support the closure could look like this:

I take back my original point about this. The below really does need to be a reference type (or: it doesn't need to be, but if it's a struct it's just going to get boxed anyway). So, disregard the below code example. I leave it only to provide context for answers that specifically mention it.

struct CompilerGenerated
{
    Obj obj;

    public CompilerGenerated(Obj obj)
    {
        this.obj = obj;
    }

    public void CallFoo()
    {
        obj.Foo();
    }
}

// ...elsewhere...

// This would not require any long-term memory allocation
// if Action were a value type, since CompilerGenerated
// is also a value type.
Action foo = new CompilerGenerated(obj).CallFoo;
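The boxing mentioned in the retraction can be observed directly. In this sketch, `Counter` is a hypothetical mutable struct (not part of the example above) used only because mutation makes the copy visible: creating a delegate from a struct's instance method stores a boxed copy of the struct as the delegate's Target.

```csharp
using System;

struct Counter
{
    public int Count;

    public void Increment()
    {
        Count++;
    }
}

static class BoxedTargetDemo
{
    public static (int Original, int Boxed) Run()
    {
        var counter = new Counter();

        // The delegate needs an object reference for its target,
        // so a copy of counter is boxed onto the heap here.
        Action increment = counter.Increment;

        increment();
        increment();

        // The local struct is untouched; the boxed copy was mutated.
        var boxed = (Counter)increment.Target;
        return (counter.Count, boxed.Count);
    }

    static void Main()
    {
        var result = Run();
        Console.WriteLine($"Original: {result.Original}, boxed copy: {result.Boxed}");
    }
}
```

The struct therefore saves nothing here: a heap allocation for the target happens anyway, in addition to the delegate itself.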

Does this question make sense? As I see it, there are two possible explanations:

  • Implementing delegates properly as value types would have required additional work/complexity, since support for things like closures that do modify values of local variables would have required compiler-generated reference types anyway.
  • There are some other reasons why, under the hood, delegates simply can't be implemented as value types.

In the end, I'm not losing any sleep over this; it's just something I've been curious about for a little while.


Update: In response to Ani's comment, I see why the CompilerGenerated type in my above example might as well be a reference type, since if a delegate is going to comprise a function pointer and an object pointer it'll need a reference type anyway (at least for anonymous functions using closures, since even if you introduced an additional generic type parameter—e.g., Action<TCaller>—this wouldn't cover types that can't be named!). However, all this does is kind of make me regret bringing the question of compiler-generated types for closures into the discussion at all! My main question is about delegates, i.e., the thing with the function pointer and the object pointer. It still seems to me that could be a value type.

In other words, even if this...

Action foo = () => { obj.Foo(); };

...requires the creation of one reference type object (to support the closure, and give the delegate something to reference), why does it require the creation of two (the closure-supporting object plus the Action delegate)?
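The two objects can be told apart by inspecting the delegate's Target. This is a sketch only: `Obj` and `ClosureObjectsDemo` are placeholder names, and the exact shape of the closure object is an implementation detail of the C# compiler, so it makes no assumption about its type name.

```csharp
using System;

class Obj
{
    public void Foo() { }
}

static class ClosureObjectsDemo
{
    public static (bool HasTarget, bool TargetIsReferenceType, bool TargetIsObj) Inspect()
    {
        var obj = new Obj();
        Action foo = () => { obj.Foo(); };

        // Target is the compiler-generated closure object that holds obj,
        // which is a separate object from the delegate foo itself.
        object target = foo.Target;

        return (
            target != null,
            target != null && !target.GetType().IsValueType,
            ReferenceEquals(target, obj));
    }

    static void Main()
    {
        var result = Inspect();
        Console.WriteLine($"Has target: {result.HasTarget}");
        Console.WriteLine($"Target is a reference type: {result.TargetIsReferenceType}");
        Console.WriteLine($"Target is obj itself: {result.TargetIsObj}");
    }
}
```

With current compilers the target is an instance of a generated class that wraps obj, so the lambda costs one closure object plus one delegate object.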

*Yes, yes, implementation detail, I know! All I really mean is short-term memory storage.

Answer

Jeffrey Sax · Oct 26, 2011

The question boils down to this: the CLI (Common Language Infrastructure) specification says that delegates are reference types. Why is this so?

One reason is clearly visible in the .NET Framework today. In the original design, there were two kinds of delegates: normal delegates and "multicast" delegates, which could have more than one target in their invocation list. The MulticastDelegate class inherits from Delegate. Since you can't inherit from a value type, Delegate had to be a reference type.

In the end, all actual delegates ended up being multicast delegates, but at that stage in the process, it was too late to merge the two classes. See this blog post about this exact topic:

We abandoned the distinction between Delegate and MulticastDelegate towards the end of V1. At that time, it would have been a massive change to merge the two classes so we didn’t do so. You should pretend that they are merged and that only MulticastDelegate exists.
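That every delegate behaves as a multicast delegate is easy to see from C#. A minimal sketch follows; `MulticastDemo` and the variable names are made up for the illustration. Combining two delegates yields a new delegate with a longer invocation list and leaves the operands unchanged.

```csharp
using System;

static class MulticastDemo
{
    public static (int SingleTargets, int CombinedTargets, int Calls) Run()
    {
        int calls = 0;
        Action first = () => calls += 1;
        Action second = () => calls += 10;

        Action single = first;

        // The + operator compiles to Delegate.Combine, which returns
        // a new delegate rather than modifying either operand.
        Action combined = single + second;

        // Invokes first and then second.
        combined();

        return (
            single.GetInvocationList().Length,
            combined.GetInvocationList().Length,
            calls);
    }

    static void Main()
    {
        var result = Run();
        Console.WriteLine($"Targets in single: {result.SingleTargets}");
        Console.WriteLine($"Targets in combined: {result.CombinedTargets}");
        Console.WriteLine($"Calls total: {result.Calls}");
    }
}
```

The invocation list is where the "array somewhere" comes from: a delegate with several targets has to store them in variable-length storage.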

In addition, delegates currently have 4-6 fields, all pointers. 16 bytes is usually considered the upper bound where saving memory still wins out over extra copying, and a 64-bit MulticastDelegate takes up 48 bytes. This, together with the fact that they were using inheritance, suggests that a class was the natural choice.