Content tagging with MongoDB

theZiki · Dec 14, 2012 · Viewed 8.2k times

I want to implement content tagging using MongoDB. In a relational database, the best approach would be to have a many-to-many relation between the content (say, "products") and tags tables. But what is the best approach with a NoSQL database?

Would it be better to put every tag in a tags array of the "content" document, or put references to tags in a string?

Answer

Philipp · Dec 14, 2012

In most cases where you have an n:m relation in MongoDB, you should use embedding instead of referencing. So I would recommend giving each product a "tags" array that holds the tag names. I assume that looking at a single product will be the most frequent use case in your system. This design lets you show the user a product with its list of tag names using a single database query.
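To make the embedded layout concrete, here is a minimal in-memory sketch in Python. The field names ("name", "tags"), the sample products, and the helper functions are assumptions for illustration only; the comments note which pymongo calls they roughly stand in for.

```python
# Hypothetical product documents with the tag names embedded in each one.
products = [
    {"_id": 1, "name": "Espresso machine", "tags": ["kitchen", "coffee"]},
    {"_id": 2, "name": "Coffee grinder", "tags": ["coffee"]},
    {"_id": 3, "name": "Desk lamp", "tags": ["office"]},
]


def product_page(product_id):
    """Return the name and tag names of one product with a single lookup.

    Roughly what db.products.find_one({"_id": product_id}) would give you.
    """
    for doc in products:
        if doc["_id"] == product_id:
            return {"name": doc["name"], "tags": doc["tags"]}
    return None


def products_with_tag(tag):
    """Mimic db.products.find({"tags": tag}).

    In MongoDB, a scalar in the filter matches any element of an array field.
    """
    return [doc["name"] for doc in products if tag in doc["tags"]]
```

If you also query by tag often, an index on the "tags" field would probably be worth adding; MongoDB indexes each array element separately in that case.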

If you need additional metadata about the tags that you don't want to bind to a product (like a long-text description of a tag), you could create an additional tags collection in which the name field gets a unique index, both for fast lookup and to avoid duplicates. When the user clicks on or hovers over a tag name, you can use an additional query to get the tag details.
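The separate tags collection could behave roughly like the sketch below. It is an in-memory stand-in, not real driver code: the class, the exception, and the "description" field are all made up for illustration. With pymongo, the unique index would be created with db.tags.create_index("name", unique=True) and the lookup would be db.tags.find_one({"name": name}).

```python
class DuplicateTagError(Exception):
    """Stand-in for the duplicate-key error a unique index would raise."""


class TagCollection:
    """In-memory stand-in for a tags collection with a unique index on "name"."""

    def __init__(self):
        # Keyed by tag name, which gives both fast lookup and uniqueness.
        self._by_name = {}

    def insert(self, name, description):
        """Add a tag; reject a second tag with the same name."""
        if name in self._by_name:
            raise DuplicateTagError(name)
        self._by_name[name] = {"name": name, "description": description}

    def find_by_name(self, name):
        """Return the tag document for a name, or None if it does not exist."""
        return self._by_name.get(name)
```

The product documents still only hold the tag names; the details are fetched on demand, for example when the user hovers over a tag.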

A problematic case in this design is when you want to delete or rename a tag: you then have to edit every product that includes the tag. But because MongoDB has no foreign keys with ON DELETE CASCADE like SQL databases do, you will always have that problem when documents reference one another.
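As a sketch of what "edit every product" means, the following Python functions apply a rename and a delete to a list of in-memory product documents. The function names and document shape are assumptions. On a real server the closest equivalents would be multi-document updates such as db.products.update_many({"tags": old}, {"$set": {"tags.$": new}}) and db.products.update_many({"tags": name}, {"$pull": {"tags": name}}); note that the positional update does not remove duplicates, whereas this sketch does.

```python
def rename_tag(products, old, new):
    """Replace `old` with `new` in every product; return how many documents changed."""
    changed = 0
    for doc in products:
        if old in doc["tags"]:
            renamed = [new if tag == old else tag for tag in doc["tags"]]
            # dict.fromkeys keeps the order and drops a duplicate
            # if the product already carried the new tag.
            doc["tags"] = list(dict.fromkeys(renamed))
            changed += 1
    return changed


def delete_tag(products, name):
    """Remove `name` from every product; return how many documents changed."""
    changed = 0
    for doc in products:
        if name in doc["tags"]:
            doc["tags"] = [tag for tag in doc["tags"] if tag != name]
            changed += 1
    return changed
```

Either way, the cost grows with the number of products that carry the tag, which is the trade-off of embedding the names.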

Renaming tags could be made easier by storing ObjectIds instead of names in the tag array of the product. But IDs have the disadvantage that they are meaningless to the user: to show a product page you need the tag names, which means you have to fetch them from the tags collection, and that requires an additional database query.
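For comparison, here is a small sketch of the ID-based variant. Plain integers stand in for ObjectIds, and the "tag_ids" field and helper names are assumptions. The name lookup corresponds to one extra query, for example db.tags.find({"_id": {"$in": product["tag_ids"]}}), while a rename only touches the single tag document.

```python
# Hypothetical tags collection keyed by id (integers stand in for ObjectIds).
tags_by_id = {
    101: {"_id": 101, "name": "coffee"},
    102: {"_id": 102, "name": "kitchen"},
}

# The product stores references instead of tag names.
product = {"_id": 1, "name": "Espresso machine", "tag_ids": [102, 101]}


def tag_names(doc):
    """Resolve a product's tag ids to names, in the order stored on the product.

    This is the additional lookup needed before a product page can be shown.
    """
    return [tags_by_id[i]["name"] for i in doc["tag_ids"] if i in tags_by_id]


def rename_tag_by_id(tag_id, new_name):
    """Rename a tag by changing one tag document; no product is edited."""
    tags_by_id[tag_id]["name"] = new_name
```

So the rename becomes cheap, but every product view pays for the extra lookup. Which side matters more depends on how often tags change compared with how often products are displayed.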