google protobuf maximum size

robin bajaj · Dec 7, 2015

I have some repeated elements in my protobuf message, so at runtime the length of the message could be anything. I have seen related questions already asked, such as "Maximum serialized Protobuf message size".

  1. I have a slightly different question here. If my JMS (Java Message Service) provider (in this case my WebLogic or TIBCO JMS server) doesn't impose any limit on the maximum message size, will the protocol buffer compiler complain at all about the message size?
  2. Does the performance of encoding/decoding suffer horribly at large sizes (around 10MB)?

Answer

Kenton Varda · Dec 9, 2015

10MB is pushing it but you'll probably be OK.

Protobuf has a hard limit of 2GB, because many implementations use 32-bit signed arithmetic. For security reasons, many implementations (especially the Google-provided ones) impose a size limit of 64MB by default, although you can increase this limit manually if you need to.
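To make the two numbers above concrete, here is a minimal sketch of the arithmetic. The names `INT32_MAX`, `DEFAULT_LIMIT`, and `classify_size` are hypothetical and are not part of any protobuf API; the 64MB default is simply the figure quoted in this answer, and a given implementation or version may use a different default.

```python
# Hypothetical constants and helper for illustration only;
# none of these names come from a protobuf library.
INT32_MAX = 2**31 - 1             # largest signed 32-bit value: 2,147,483,647 bytes
DEFAULT_LIMIT = 64 * 1024 * 1024  # 64MB, the default quoted in this answer


def classify_size(n_bytes, limit=DEFAULT_LIMIT):
    """Say which limit, if any, a serialized size of n_bytes would hit."""
    if n_bytes < 0:
        raise ValueError("size must be non-negative")
    if n_bytes > INT32_MAX:
        return "exceeds hard limit"        # cannot be represented in 32-bit signed arithmetic
    if n_bytes > limit:
        return "exceeds configured limit"  # would need the limit raised manually
    return "ok"
```

Under these assumptions, a 10MB message is well inside the default limit, while a 100MB message would only parse after the limit has been raised.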

The implementation will not "slow down" with large messages per se, but the problem is that you must always parse an entire message at once before you can start using any of the content. This means the entire message must fit into RAM (keeping in mind that, after parsing, the in-memory message objects are much larger than the original serialized message), and even if you only care about one field, you have to wait for the whole thing to parse.

Generally I recommend trying to limit yourself to 1MB as a rule of thumb. Beyond that, think about splitting the message up into multiple chunks that can be parsed independently. However, every application is different: for some, 10MB is no big deal; for others, 1MB is already way too large. You'll have to profile your own app to find out.
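One common way to split a large payload into independently parseable chunks is to write each chunk with a length prefix, so the reader can pull them off a stream one at a time. The sketch below is only an illustration under stated assumptions: the chunks are opaque byte strings standing in for serialized messages, the helper names (`encode_varint`, `read_varint`, `write_chunks`, `read_chunks`) are hypothetical, and the 1MB per-chunk cap just mirrors the rule of thumb above. The prefix uses the same base-128 varint encoding that protobuf uses on the wire.

```python
import io
from typing import Iterable, Iterator, Optional


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def read_varint(stream) -> Optional[int]:
    """Read one varint; return None on a clean end of stream."""
    result = 0
    shift = 0
    while True:
        byte = stream.read(1)
        if not byte:
            if shift == 0:
                return None  # no more chunks
            raise EOFError("stream ended inside a length prefix")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return result
        shift += 7


def write_chunks(stream, chunks: Iterable[bytes]) -> None:
    """Write each chunk as <varint length><payload>."""
    for chunk in chunks:
        stream.write(encode_varint(len(chunk)))
        stream.write(chunk)


def read_chunks(stream, max_chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield chunks one at a time, rejecting any that exceed max_chunk_size."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        if size > max_chunk_size:
            raise ValueError("chunk of %d bytes exceeds limit" % size)
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a chunk")
        yield payload
```

With this layout the reader holds only one chunk in memory at a time and can start using the first chunk before the rest has arrived. In real code each payload would be a separately serialized message, parsed as soon as it is read.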

I've actually seen cases where people were happy sending messages larger than 1GB, so... it "works".

On a side note, Cap'n Proto has a very similar design to Protobuf but can support messages up to 2^64 bytes (2^32 segments of 4GB each), and it actually does allow you to read one field from the message without parsing the whole message (if it's in a file on disk, use mmap() to avoid reading the whole thing in).

(Disclosure: I'm the author of Cap'n Proto as well as most of Google's open source Protobuf code.)