Difference between storing an ObjectId and its string form, in MongoDB

B T · Jan 12, 2015 · Viewed 21k times

I'm a little confused by MongoDB's use of ObjectIds. Sure, they're great for creating ids client-side that almost certainly won't conflict with other client-side created ids. But Mongo seems to store them in some special way. Storing a string representation of the id is different from storing an ObjectId as an object. Why is this?

Doesn't the string form have all the same information that the object form has? Why does Mongo go to such lengths to differentiate those two forms? It screws me up when I try to compare _ids sent from a frontend, for example. My database is not at all consistent about whether it stores string-form ids or object-form ids, and though my code is certainly partially to blame, I mostly blame Mongo for making this so weird.

Am I wrong that this is weird? Why does mongo do it this way?

Answer

Sammaye · Jan 12, 2015

I, personally, blame your code. I get around this perfectly fine in my applications by coding the right way. I convert to string in code to compare, and I ensure that anything that looks like an ObjectId is actually used as an ObjectId.
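A minimal sketch of that normalize-then-compare approach, using only the Python standard library. The helper names `to_hex` and `same_id` are hypothetical, and 12 raw bytes stand in for the driver's ObjectId type; with a real driver you would convert with the driver's own ObjectId class instead.

```python
import re

# An ObjectId's string form is 24 hexadecimal characters.
_HEX24 = re.compile(r"[0-9a-fA-F]{24}")


def to_hex(value):
    """Normalize an id to a lowercase 24-character hex string.

    Accepts 12 raw bytes (a stand-in for an ObjectId) or a 24-character
    hex string such as one sent from a frontend.
    """
    if isinstance(value, (bytes, bytearray)):
        if len(value) != 12:
            raise ValueError("expected exactly 12 bytes")
        return bytes(value).hex()
    if isinstance(value, str) and _HEX24.fullmatch(value):
        return value.lower()
    raise ValueError(f"not an ObjectId-like value: {value!r}")


def same_id(a, b):
    """Compare two ids regardless of which form each one arrived in."""
    return to_hex(a) == to_hex(b)


stored = bytes.fromhex("54b3e1f0a1b2c3d4e5f60718")   # binary form, as stored
from_frontend = "54B3E1F0A1B2C3D4E5F60718"           # string form, as received
print(same_id(stored, from_frontend))  # True
```

Doing the conversion in one place like this means the rest of the code never has to care which form an id arrived in.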

It is worth noting that an ObjectId (http://docs.mongodb.org/manual/reference/object-id/) and its hex representation differ in size by 12 bytes: the ObjectId is 12 bytes, while its hex representation is 24.
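The size difference is easy to check by hand. The sketch below does not use a MongoDB driver; it builds a simplified 12-byte stand-in for an ObjectId (a 4-byte timestamp plus 8 random bytes, which is an assumption for illustration and not the exact ObjectId layout) and compares it with its hex string.

```python
import os
import struct
import time

# Simplified stand-in for an ObjectId: a 4-byte big-endian timestamp
# followed by 8 random bytes, 12 bytes in total.
raw = struct.pack(">I", int(time.time())) + os.urandom(8)
hex_form = raw.hex()

print(len(raw))                        # 12
print(len(hex_form.encode("ascii")))   # 24

# The hex string carries the same information; it round-trips exactly.
assert bytes.fromhex(hex_form) == raw
```

So the string form does hold the same information; it just takes twice the space to do it.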

It is not only about storage efficiency but also about indexes. An index on ObjectIds is smaller, and because ObjectIds are generated in roughly increasing order, they can be used in a special manner to ensure that only the parts of the index that are actually used get loaded. This is most noticeable when inserting, where only the most recent part of the index needs to be loaded to ensure uniqueness. You cannot guarantee such behaviour with the hex representation.
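To illustrate the insert behaviour, here is a toy model in which a sorted Python list stands in for the _id index. It is not how MongoDB's storage engine actually manages its index, and `make_id` is a hypothetical stand-in that only mimics the leading timestamp of an ObjectId. The point is just that keys with an increasing prefix always land at the tail.

```python
import bisect
import os
import struct


def make_id(timestamp):
    """Hypothetical stand-in: 4-byte big-endian timestamp + 8 random bytes."""
    return struct.pack(">I", timestamp) + os.urandom(8)


index = []       # sorted list standing in for the _id index
positions = []   # distance of each insert point from the end of the index

# Ten arbitrary consecutive seconds, one new id per second.
for ts in range(1421020800, 1421020810):
    key = make_id(ts)
    pos = bisect.bisect_left(index, key)
    positions.append(len(index) - pos)   # 0 means "appended at the end"
    index.insert(pos, key)

print(positions)  # all zeros: every insert touched only the tail
```

Keys without such a prefix (fully random ones, say) would land at arbitrary positions, so any part of the index might be needed on insert.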

I would strongly recommend that you do not use the ObjectId's hex representation. If you want to "make your life easier", you would be better off creating a different _id which is smaller but somehow just as unique and index friendly.