Fastest possible JavaScript object serialization with Google V8

user172783 · Jun 2, 2011 · Viewed 13.6k times

I need to serialize moderately complex objects with anywhere from one to several hundred mixed-type properties.

I originally used JSON, then switched to BSON, which is marginally faster.

Encoding 10000 sample objects

JSON:        1807 ms
BSON:        1687 ms
MessagePack: 2644 ms (JS, modified for BinaryF)
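
For anyone wanting to reproduce this kind of measurement, here is a minimal timing sketch. The sample object shape and the `makeSample` and `timeEncode` helpers are hypothetical (the actual benchmark objects are not shown in the question), and only the built-in JSON encoder is timed. Any other encoder function could be passed in the same way.

```javascript
// Hypothetical sample object with mixed-type properties.
// The real benchmark objects from the question are not known.
function makeSample(i) {
  return {
    id: i,
    name: 'object-' + i,
    active: i % 2 === 0,
    score: i * 1.5,
    tags: ['a', 'b', 'c'],
    nested: { created: 1306972800000 + i, note: 'some text' },
  };
}

// Run `encode` over every sample and report the elapsed wall-clock time.
// `size` sums the `.length` of each encoded result (characters for strings,
// bytes for Buffers), mainly so the work cannot be optimized away.
function timeEncode(label, encode, samples) {
  const start = performance.now();
  let size = 0;
  for (const obj of samples) {
    size += encode(obj).length;
  }
  const elapsedMs = performance.now() - start;
  console.log(`${label}: ${elapsedMs.toFixed(1)} ms, total output length ${size}`);
  return { elapsedMs, size };
}

const samples = Array.from({ length: 10000 }, (_, i) => makeSample(i));
const jsonResult = timeEncode('JSON', (obj) => JSON.stringify(obj), samples);
```

Timings from a loop like this vary a lot between runs and machines, so it is only useful for comparing encoders against each other on the same data.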

I want an order-of-magnitude speedup; serialization is having a ridiculously bad impact on the rest of the system.

Part of the motivation for moving to BSON is the requirement to encode binary data, so JSON is (now) unsuitable. And because JSON simply skips the binary data present in the objects, it is "cheating" in those benchmarks.
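
As an illustration of how JSON can silently drop binary data, here is a small sketch. It assumes the binary payload is held in an `ArrayBuffer`; the question does not say how the binary data is actually stored, so this is only one way the effect can show up.

```javascript
// Assumption: the binary payload lives in an ArrayBuffer.
const record = {
  name: 'sample',
  payload: new ArrayBuffer(4), // four raw bytes
};

// An ArrayBuffer has no enumerable own properties, so JSON.stringify
// emits an empty object for it and the bytes are lost.
const encoded = JSON.stringify(record);
console.log(encoded); // {"name":"sample","payload":{}}

// After a round trip the payload is just a plain empty object.
const decoded = JSON.parse(encoded);
console.log(decoded.payload instanceof ArrayBuffer); // false
```

The encoder does far less work for such a field than a binary-capable format would, which is why the JSON timing is not a fair comparison.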

Profiled BSON performance hot-spots

  • (unavoidable?) conversion of UTF16 V8 JS strings to UTF8.
  • malloc and string ops inside the BSON library

The BSON encoder is based on the Mongo BSON library.

A native V8 binary serializer might be wonderful, yet since JSON is native and quick to serialize, I fear even that might not provide the answer. Perhaps my best bet is to optimize the heck out of the BSON library, or write my own, and figure out a far more efficient way to pull strings out of V8. One tactic might be to add UTF16 support to BSON.
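
To make the string hot-spot and the UTF16 tactic concrete, here is a small sketch using Node's `Buffer`. It is not how the BSON library extracts strings; it only shows the two encodings side by side. The sample text is arbitrary.

```javascript
// Arbitrary sample text with a few non-ASCII characters
// (written with escapes: "naive" with i-diaeresis, "cafe" with e-acute, a hot-beverage symbol).
const sampleText = 'na\u00efve caf\u00e9 \u2615';

// UTF-8 output: every character has to be inspected and re-encoded
// into 1 to 4 bytes, which is the transcoding cost listed above.
const asUtf8 = Buffer.from(sampleText, 'utf8');

// UTF-16LE output: each 16-bit code unit is written out as two bytes,
// with no per-character re-encoding.
const asUtf16 = Buffer.from(sampleText, 'utf16le');

console.log(sampleText.length); // 12 code units
console.log(asUtf8.length);     // 16 bytes
console.log(asUtf16.length);    // 24 bytes
```

The trade-off is visible in the sizes: skipping the transcode costs two bytes per code unit, so mostly-ASCII data gets larger on the wire.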

So I'm here for ideas, and perhaps a sanity check.

Edit

Added MessagePack benchmark. This was modified from the original JS to use BinaryF.

The C++ MessagePack library may offer further improvements; I may benchmark it in isolation to compare directly with the BSON library.

Answer

deft_code · Jun 14, 2011

For serialization/deserialization, protobuf is pretty tough to beat. I don't know whether you can switch out the transport protocol, but if you can, protobuf should definitely be considered.
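
Part of the reason protobuf encodes quickly is that its wire format is very small: field names are never written, only a numeric tag followed by the value. The sketch below hand-encodes two fields to show the idea. The helper names (`encodeVarint`, `encodeVarintField`, `encodeStringField`) are hypothetical and are not part of any protobuf library; in practice you would use a real implementation with a schema.

```javascript
// Encode a non-negative integer as a base-128 varint:
// 7 bits per byte, least significant group first, high bit set on
// every byte except the last.
function encodeVarint(value) {
  const bytes = [];
  let v = value;
  while (v > 0x7f) {
    bytes.push((v & 0x7f) | 0x80);
    v = Math.floor(v / 128);
  }
  bytes.push(v);
  return bytes;
}

// Integer field: key = (fieldNumber << 3) | wire type 0 (varint).
function encodeVarintField(fieldNumber, value) {
  return [...encodeVarint((fieldNumber << 3) | 0), ...encodeVarint(value)];
}

// String field: key = (fieldNumber << 3) | wire type 2 (length-delimited),
// then the UTF-8 byte length, then the UTF-8 bytes.
function encodeStringField(fieldNumber, text) {
  const payload = Buffer.from(text, 'utf8');
  return [
    ...encodeVarint((fieldNumber << 3) | 2),
    ...encodeVarint(payload.length),
    ...payload,
  ];
}

const intField = Buffer.from(encodeVarintField(1, 150));
const strField = Buffer.from(encodeStringField(2, 'testing'));
console.log(intField.toString('hex')); // 089601
console.log(strField.toString('hex')); // 120774657374696e67
```

Note that strings are still written as UTF-8, so the string transcoding cost from the question does not go away by switching formats.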

Take a look at all the answers to "Protocol Buffers versus JSON or BSON".

The accepted answer there chooses Thrift, which is, however, slower than protobuf. I suspect it was chosen for ease of use (with Java), not speed. These Java benchmarks are very telling. Of note:

  • MongoDB-BSON 45042
  • protobuf 6539
  • protostuff/protobuf 3318

The benchmarks are Java, but I'd imagine that you can achieve speeds near the protostuff implementation of protobuf, i.e. roughly 13.5 times faster than MongoDB-BSON. Worst case (if for some reason Java is just better at serialization), you should do no worse than the plain unoptimized protobuf implementation, which runs roughly 6.8 times faster.