What are the key differences between Apache Thrift, Google Protocol Buffers, MessagePack, ASN.1 and Apache Avro?

andreypopp · Jan 8, 2011 · Viewed 36.7k times

All of these provide binary serialization, RPC frameworks and an IDL. I'm interested in the key differences between them and in their characteristics (performance, ease of use, programming language support).

If you know of any other similar technologies, please mention them in an answer.

Answer

JUST MY correct OPINION · Jan 8, 2011

ASN.1 is an ISO/IEC standard. It has a very readable source language and a variety of back-ends, both binary and human-readable. Being an international standard (and an old one at that!) the source language is a bit kitchen-sinkish (in about the same way that the Atlantic Ocean is a bit wet) but it is extremely well-specified and has a decent amount of support. (You can probably find an ASN.1 library for any language you name if you dig hard enough, and if not there are good C language libraries available that you can use in FFIs.) It is, being a standardized language, obsessively documented and has a few good tutorials available as well.

Thrift is not a standard. It originated at Facebook, was later open-sourced, and is currently a top-level Apache project. It is not well documented (especially at the tutorial level) and, to my admittedly brief glance, doesn't appear to add anything that other, previous efforts don't already do (and in some cases do better). To be fair to it, it supports a rather impressive number of languages out of the box, including a few of the higher-profile non-mainstream ones. The IDL is also vaguely C-like.

Protocol Buffers is not a standard. It is a Google product that is being released to the wider community. It is a bit limited in terms of languages supported out of the box (at the time of writing it only supports C++, Python and Java) but it does have a lot of third-party support for other languages (of highly variable quality). Google does pretty much all of their work using Protocol Buffers, so it is a battle-tested, battle-hardened protocol (albeit not as battle-hardened as ASN.1 is). It has much better documentation than Thrift does but, being a Google product, it is highly likely to be unstable (in the sense of ever-changing, not in the sense of unreliable). The IDL is also C-like.

All of the above systems use a schema defined in some kind of IDL to generate code for a target language that is then used in encoding and decoding. Avro does not. Avro's typing is dynamic and its schema data is used directly at runtime both to encode and to decode (which has some obvious costs in processing, but also some obvious benefits vis-à-vis dynamic languages and the lack of a need for tagging types, etc.). Its schema uses JSON, which makes supporting Avro in a new language a bit easier to manage if there's already a JSON library. Again, as with most wheel-reinventing protocol description systems, Avro is also not standardized.
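
To make the "schema used at runtime, no type tags" point concrete, here is a minimal sketch in Python. It is not the Avro library and does not use its API: it is a hand-rolled toy that handles only `long`, `string` and flat or nested `record` types, following the Avro binary encoding rules for those types as I understand them (zigzag variable-length integers, length-prefixed UTF-8 strings, record fields concatenated in schema order). The `User` schema and the function names are invented for the illustration.

```python
import json

# Hypothetical schema, invented for this illustration. Avro schemas are JSON,
# so a plain JSON library is enough to load one.
SCHEMA = json.loads("""
{"type": "record", "name": "User",
 "fields": [{"name": "id", "type": "long"},
            {"name": "name", "type": "string"}]}
""")


def encode_long(n):
    """Zigzag-encode a signed 64-bit integer, then emit it 7 bits at a time."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Inverse of encode_long. Returns (value, next position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def schema_type(schema):
    # A schema is either a bare type name ("long") or an object with a "type".
    return schema if isinstance(schema, str) else schema["type"]


def encode(schema, value):
    """Walk the schema at runtime; no generated code, no per-field tags."""
    kind = schema_type(schema)
    if kind == "long":
        return encode_long(value)
    if kind == "string":
        data = value.encode("utf-8")
        return encode_long(len(data)) + data
    if kind == "record":
        # Fields are written back to back in schema order.
        return b"".join(encode(f["type"], value[f["name"]])
                        for f in schema["fields"])
    raise ValueError("type not handled by this sketch: %r" % (kind,))


def decode(schema, buf, pos=0):
    """Decode using the same schema. Returns (value, next position)."""
    kind = schema_type(schema)
    if kind == "long":
        return decode_long(buf, pos)
    if kind == "string":
        length, pos = decode_long(buf, pos)
        return buf[pos:pos + length].decode("utf-8"), pos + length
    if kind == "record":
        record = {}
        for f in schema["fields"]:
            record[f["name"]], pos = decode(f["type"], buf, pos)
        return record, pos
    raise ValueError("type not handled by this sketch: %r" % (kind,))


payload = encode(SCHEMA, {"id": 1, "name": "ab"})
record, _ = decode(SCHEMA, payload)
```

Note that the payload carries no field names or type tags at all, which is why the reader needs the writer's schema to make sense of the bytes. That is the trade-off described above: compact data and easy support in dynamic languages, paid for with schema interpretation at runtime.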

Personally, despite my love/hate relationship with it, I'd probably use ASN.1 for most RPC and message transmission purposes, although it doesn't really have an RPC stack (you'd have to make one, but information object classes, or IOCs, make that simple enough).