MongoDB: base64 image vs. GridFS

danielrvt · Mar 24, 2013

I'm using MongoDB and I want to store some thumbnails on my server. Which is better: using GridFS, or converting the images to base64 and storing them directly inside a document?

Answer

Benjamin M · Aug 8, 2015

As always, there are some (dis)advantages to embedding the thumbnail in the document:

Pros:

  • Fewer database requests if only the document plus its thumbnail is needed.
  • Fewer client requests. (Of course, you could fetch the thumbnails from GridFS and put them in the response, but that would result in more database requests.)

Neutral:

  • Storage requirements are roughly equal, provided the embedded image is stored as a binary field rather than as a base64 string.

Cons:

  • You can't easily reuse the same thumbnail in another document, because there's no id to reference. (For us, that's not an issue: the server responses are gzip-compressed, so you can't really tell the difference between 1 and 5 identical images.)

With MongoDB and NoSQL it's all about knowing your use cases!

  • If lots of your documents share the same image, you should use GridFS and just provide links to those files, because 1. sharing data is more space-efficient and 2. the client can cache the image and only has to retrieve it once.

  • If your clients will always need the thumbnail, you should consider embedding the files as base64 within the response. This is especially nice if 1. images are not shared between documents and/or 2. images change often, so caching is useless or not possible.

  • Base64 of course means more traffic on the wire, because it needs 8 bits to transfer 6 bits of data, i.e. 75% efficiency (the encoded image is about a third larger). This only affects client-server communication, because within MongoDB you can always store the data as a binary field.

  • Do you prefer more database requests (= using GridFS)? Or bigger data/document size on the wire (= embedded)?
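
To make the 75% figure for base64 concrete, here is a minimal sketch using only Python's standard library. The 3000-byte `thumbnail` is a made-up stand-in for a real image; only its length matters for the size calculation.

```python
import base64


def base64_efficiency(raw: bytes) -> float:
    """Return raw size divided by base64-encoded size."""
    encoded = base64.b64encode(raw)
    return len(raw) / len(encoded)


# Hypothetical thumbnail: 3000 zero bytes (the content does not affect the encoded size).
thumbnail = bytes(3000)
encoded = base64.b64encode(thumbnail)

# Every 3 raw bytes become 4 base64 characters.
print(len(thumbnail), len(encoded))      # 3000 4000
print(base64_efficiency(thumbnail))      # 0.75
```

If the image is kept as raw bytes in a binary field inside MongoDB, this overhead only appears once the server encodes it for the response.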

What we did:

We use embedded thumbnails, even though we potentially have duplicate images. After activating gzip compression on the server, the server-client transfer size didn't matter anymore. But as said before, it's a tradeoff: we now have fewer client requests and fewer database requests, but because embedding makes caching the images impossible, we also have more data on the wire.
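
The effect of gzip on duplicate embedded thumbnails can be approximated with a small experiment. This is only a sketch under assumptions: the document shape (`_id`, `title`, `thumbnail`) is hypothetical, and the "thumbnail" is 4 KiB of pseudo-random bytes standing in for already-compressed image data.

```python
import base64
import gzip
import json
import random

# Stand-in for a real thumbnail: 4 KiB of pseudo-random (incompressible) bytes.
rng = random.Random(0)
thumbnail = bytes(rng.getrandbits(8) for _ in range(4096))
thumb_b64 = base64.b64encode(thumbnail).decode("ascii")


def gzipped_size(copies: int) -> int:
    """Gzipped size of a JSON response embedding the same thumbnail `copies` times."""
    docs = [
        {"_id": i, "title": f"item {i}", "thumbnail": thumb_b64}
        for i in range(copies)
    ]
    body = json.dumps(docs).encode("utf-8")
    return len(gzip.compress(body))


one = gzipped_size(1)
five = gzipped_size(5)
print("base64 length:", len(thumb_b64))
print("gzipped, 1 copy:", one)
print("gzipped, 5 copies:", five)
```

On a payload like this, the five-copy response should come out only slightly larger than the one-copy response, because gzip replaces the repeats with back-references. Note that DEFLATE only looks back 32 KiB, so this relies on the duplicates sitting close together in the response; larger images or widely separated copies would not benefit in the same way.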

Conclusion:

There's no one-size-fits-all solution.