What does the ProtoInclude attribute mean (in protobuf-net)

Mike Dinescu picture Mike Dinescu · Jun 4, 2009 · Viewed 12.7k times · Source

In the ProtoBuf-Net implementation, what does the ProtoInclude attribute mean, and what does it do?

An example would be appreciated.

I saw it in this post and I'm not sure what it does. The example was:

[Serializable,
 ProtoContract,
 ProtoInclude(50, typeof(BeginRequest))]
abstract internal class BaseMessage
{
  [ProtoMember(1)]
  abstract public UInt16 messageType { get; }
}

[Serializable,
 ProtoContract]
internal class BeginRequest : BaseMessage
{
    [ProtoMember(1)]
    public override UInt16 messageType
    {
        get { return 1; }
    }
}

Also, is there a way to generate such inheritance using the protogen tool?

Answer

Marc Gravell picture Marc Gravell · Jun 12, 2009

Sorry, I didn't mean to miss this one - alas, I don't see everything.

Given the specifics in the question, I'm going to assume that you are at least passingly familiar with .proto; correct me if I am wrong.

[ProtoInclude] works a lot like [XmlInclude] for XmlSerializer - or [KnownType] for DataContractSerializer - it allows it to recognise subclasses of a type during (de)serialization. The only additional thing is that it needs a tag (number) to identify each sub-type (that must be unique, and not clash with any of the fields from the parent type).

Re protogen: nope; the underlying spec (by google) makes no provision for inheritance at all, so protogen (via .proto) has no mechanism to express this. protobuf-net provides inheritance support as an extension, but does it in a way that still leaves the messages wire-compatible with the other implementations. At a push, maybe I could add protogen support via the new extension properties in the google spec, but I haven't done this yet.

So; to look at the example; that expresses an inheritance relationship between BaseMessage and BeginRequest; regardless of whether you do:

Serialize<BaseMessage>(...)
Serialize<BeginRequest>(...)
  • either way, it will start at the base (BaseMessage) and work upwards; which isn't exactly true - it writes the data starting with BeginRequest (so that it knows we have a BeginRequest as early as possible during deserialization). The important thing is that the fields from any parent contract types is included, and the serializer looks at the actual object passed in - not just the type you say it is.

Likewise, during deserilaization, regardless of whether you use:

Deserialize<BaseMessage>(...)
Deserialize<BeginRequest>(...)

you will get the type you actually serialized (presumably a BeginRequest).

Under the bonnet, for compatibility purposes (with the wide protocol buffers specification), this is similar to writing something like (forgive any errors, my .proto is rusty):

message BaseMessage {
    optional BeginRequest beginRequest = 50;
    optional uint32 messageType = 1;   
}
message BeginRequest {        
}

(the override probably shouldn't specify [ProtoMember], btw.

Normally, it would write fields in ascending tag order, but to make for efficient deserialization the engine cheekily chooses to write the subclass data first (which is explicitly allowed by the spec) - i.e. it writes something like (you'll have to imagine the binary...):

[tag 50, string][length of sub-message][body of sub-message][tag 1, int][value]

(in this case, the body of the sub-message is empty)

Does that cover it?