Database design for apps using "hashtags"

FaddishWorm · Jul 17, 2014

Database design question here.

Say we have a web app that uses hashtags in 20-40 word notes. What is the best way to store a user's hashtags?

For instance, if a user entered "I like to have #lunch at #sizzler", we would store the sentence as text, and we could store the hashtags as JSON, a comma-separated list, or some other mechanism.

It's also worth pointing out that the tags need to be searchable, for example to find out how many people have been hashtagging lunch.

Advice on the matter would be great; I always get a bit stumped when it comes to storing variable-sized inputs in MySQL. There can be any number of hashtags per note, so what is the best way to store them?

Answer

DrCopyPaste · Jul 17, 2014

I would advise going with a typical many-to-many relationship between messages and tags.

That means you need three tables:

  • Messages (columns: Id, UserId and Content)
  • Tags (columns: Id and TagName)
  • TagMessageRelations (columns: MessageId and TagId, which connect messages and tags via foreign keys pointing to Messages.Id and Tags.Id)

That way you do not store a tag multiple times. If a tag already exists in the Tags table, you only create a new relation between it and the message.
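As a rough sketch of that layout, the snippet below uses Python's standard-library sqlite3 module as a stand-in for MySQL (an assumption, made only so the example runs without a database server). The table and column names follow the list above. The save_message helper, the hashtag regular expression, and the lower-casing of tags are illustrative choices and not part of the original answer.

```python
import re
import sqlite3

# Hypothetical pattern: a hashtag is '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")

SCHEMA = """
CREATE TABLE Messages (
    Id      INTEGER PRIMARY KEY,
    UserId  INTEGER NOT NULL,
    Content TEXT    NOT NULL
);
CREATE TABLE Tags (
    Id      INTEGER PRIMARY KEY,
    TagName TEXT NOT NULL UNIQUE      -- each tag is stored only once
);
CREATE TABLE TagMessageRelations (
    MessageId INTEGER NOT NULL REFERENCES Messages(Id),
    TagId     INTEGER NOT NULL REFERENCES Tags(Id),
    PRIMARY KEY (MessageId, TagId)    -- one row per (message, tag) pair
);
"""


def save_message(conn, user_id, content):
    """Store a note and link it to its hashtags (illustrative helper)."""
    cur = conn.execute(
        "INSERT INTO Messages (UserId, Content) VALUES (?, ?)",
        (user_id, content),
    )
    message_id = cur.lastrowid
    # Assumption: tags are case-insensitive, so normalise to lower case.
    # The set also drops a tag repeated within the same note.
    for tag in {t.lower() for t in HASHTAG_RE.findall(content)}:
        # Create the tag only if it does not exist yet (SQLite syntax).
        conn.execute("INSERT OR IGNORE INTO Tags (TagName) VALUES (?)", (tag,))
        tag_id = conn.execute(
            "SELECT Id FROM Tags WHERE TagName = ?", (tag,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO TagMessageRelations (MessageId, TagId) VALUES (?, ?)",
            (message_id, tag_id),
        )
    conn.commit()
    return message_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
save_message(conn, 1, "I like to have #lunch at #sizzler")
save_message(conn, 2, "Skipping #lunch today")
```

After these two calls the Tags table holds two rows (lunch and sizzler) while TagMessageRelations holds three, since lunch is linked to both notes. INSERT OR IGNORE is SQLite syntax; in MySQL the same effect is usually achieved with INSERT IGNORE against a unique index on TagName.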

This way you can

  • easily count how many distinct tags there are (SELECT COUNT(*) FROM Tags)
  • store each tag only once and index Tags.TagName, so searching for a tag stays fast
  • or count how many times a certain tag (here 'lunch') was used per user - for example:

SELECT Messages.UserId, COUNT(*)
FROM Tags
INNER JOIN TagMessageRelations ON Tags.Id = TagMessageRelations.TagId
INNER JOIN Messages ON TagMessageRelations.MessageId = Messages.Id
WHERE Tags.TagName = 'lunch'
GROUP BY Messages.UserId
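To tie this back to the question of how many people have been hashtagging lunch, here is a small runnable sketch of two such counting queries. It assumes the three-table layout described in this answer and uses Python's sqlite3 module in place of MySQL so it can run standalone. The sample rows and the helper names users_who_tagged and uses_per_user are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Messages (Id INTEGER PRIMARY KEY, UserId INTEGER, Content TEXT);
CREATE TABLE Tags (Id INTEGER PRIMARY KEY, TagName TEXT UNIQUE);
CREATE TABLE TagMessageRelations (MessageId INTEGER, TagId INTEGER);
-- Made-up sample data for illustration only.
INSERT INTO Messages VALUES (1, 10, 'I like to have #lunch at #sizzler'),
                            (2, 10, '#lunch again'),
                            (3, 20, 'Late #lunch');
INSERT INTO Tags VALUES (1, 'lunch'), (2, 'sizzler');
INSERT INTO TagMessageRelations VALUES (1, 1), (1, 2), (2, 1), (3, 1);
""")

# Shared join from a tag name to the messages that carry it.
JOIN_SQL = """
    FROM Tags
    INNER JOIN TagMessageRelations ON Tags.Id = TagMessageRelations.TagId
    INNER JOIN Messages ON TagMessageRelations.MessageId = Messages.Id
    WHERE Tags.TagName = ?
"""


def users_who_tagged(conn, tag_name):
    """Count the distinct users who used the given tag at least once."""
    sql = "SELECT COUNT(DISTINCT Messages.UserId)" + JOIN_SQL
    return conn.execute(sql, (tag_name,)).fetchone()[0]


def uses_per_user(conn, tag_name):
    """Return {UserId: number of messages carrying the given tag}."""
    sql = "SELECT Messages.UserId, COUNT(*)" + JOIN_SQL + "GROUP BY Messages.UserId"
    return dict(conn.execute(sql, (tag_name,)).fetchall())


print(users_who_tagged(conn, "lunch"))  # 2
print(uses_per_user(conn, "lunch"))     # {10: 2, 20: 1}
```

The tag name is passed as a bound parameter rather than concatenated into the SQL string, which matters when the tag comes from user input.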