Firebase data structure and URL

vzhen · May 19, 2013

I'm new to Firebase and NoSQL, so bear with me if I use SQL as a reference. My question is: how should I structure data in Firebase?

In Firebase, does every "new Firebase" correspond to a new database, or to a table in MySQL?

Say my real-time web app has users and comments. In MySQL, I would create a users table and a comments table, then link them together.

How do I structure this in Firebase?

Answer

Frank van Puffelen · May 20, 2013

If you have users and comments, you could easily model it like this:

ROOT
 |
 +-- vzhen
 |     |
 |     +-- Vzhen's comment 1
 |     |
 |     +-- Vzhen's comment 2
 |
 +-- Frank van Puffelen
       |
       +-- Frank's comment 1
       |
       +-- Frank's comment 2
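
As a rough illustration, the tree above can be written as a nested dictionary, which is close in shape to the single JSON tree Firebase stores. This is only an in-memory Python sketch, not Firebase code; the keys `comment1` and `comment2` are names I made up for the example.

```python
# In-memory stand-in for the Firebase tree; no Firebase SDK is involved.
# The child keys "comment1"/"comment2" are hypothetical names.
root = {
    "vzhen": {
        "comment1": "Vzhen's comment 1",
        "comment2": "Vzhen's comment 2",
    },
    "Frank van Puffelen": {
        "comment1": "Frank's comment 1",
        "comment2": "Frank's comment 2",
    },
}


def comments_of(tree, user):
    """Return the comment texts stored directly under a user's node."""
    return list(tree.get(user, {}).values())


print(comments_of(root, "vzhen"))
```

Reading one user's node gives you all of that user's comments in a single lookup, which is the property the later models try to keep.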

However it is more likely that there is a third entity, like an article, and that users are commenting on (each other's) articles.

Firebase doesn't have the concept of a foreign key, but it's easy to mimic it. If you do that, you can model the user/article/comment structure like this:

ROOT
 |
 +-- ARTICLES
 |     |
 |     +-- Text of article 1 (AID=1)
 |     |
 |     +-- Text of article 2 (AID=2)
 |
 +-- USERS
 |     |
 |     +-- vzhen (UID=1056201)
 |     |
 |     +-- Frank van Puffelen (UID=209103)
 |
 +-- COMMENTS
 |     |
 |     +-- Vzhen's comment on Article 1 (CID=1)
 |     |
 |     +-- Frank's response (CID=2)
 |     |
 |     +-- Frank's comment on article 2 (CID=3)
 |
 +-- ARTICLE_USER_COMMENT
       |
       +-- (AID=1,UID=1056201,CID=1)
       |
       +-- (AID=1,UID=209103,CID=2)
       |
       +-- (AID=2,UID=209103,CID=3)

This is quite a direct mapping of the way you'd model this in a relational database. The main problem with this model is the number of lookups you'll need to do to get the information for a single screen. To show one article with its comments, you have to:

  1. Read the article itself (from the ARTICLES node)
  2. Read the information about the comments (from the ARTICLE_USER_COMMENT node)
  3. Read the content of the comments (from the COMMENTS node)

Depending on your needs, you might also need to read the USERS node.

And keep in mind that Firebase does not have the concept of a WHERE clause that allows you to select just the elements from ARTICLE_USER_COMMENT that match a specific article, or a specific user.
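
To make the cost concrete, here is a small sketch that counts the reads needed for one article screen with the relational-style layout. It uses plain Python dictionaries as a stand-in for the tree (not the Firebase SDK), and the function name `load_article_screen` is hypothetical. Note that the whole ARTICLE_USER_COMMENT node has to be fetched and filtered on the client, since there is no WHERE clause to do it for us.

```python
# In-memory stand-in for the relational-style tree; not the Firebase SDK.
root = {
    "ARTICLES": {"1": "Text of article 1", "2": "Text of article 2"},
    "USERS": {"1056201": "vzhen", "209103": "Frank van Puffelen"},
    "COMMENTS": {
        "1": "Vzhen's comment on Article 1",
        "2": "Frank's response",
        "3": "Frank's comment on article 2",
    },
    "ARTICLE_USER_COMMENT": [
        {"AID": "1", "UID": "1056201", "CID": "1"},
        {"AID": "1", "UID": "209103", "CID": "2"},
        {"AID": "2", "UID": "209103", "CID": "3"},
    ],
}


def load_article_screen(tree, aid):
    """Collect an article, its comments and their authors, counting reads."""
    reads = 0
    article = tree["ARTICLES"][aid]
    reads += 1
    # No WHERE clause: fetch every link, then filter on the client.
    links = tree["ARTICLE_USER_COMMENT"]
    reads += 1
    comments = []
    for link in links:
        if link["AID"] != aid:
            continue
        text = tree["COMMENTS"][link["CID"]]
        reads += 1
        author = tree["USERS"][link["UID"]]
        reads += 1
        comments.append((author, text))
    return article, comments, reads


article, comments, reads = load_article_screen(root, "1")
print(article, comments, reads)
```

The read count grows with the number of comments, and the link node that gets downloaded grows with every comment on every article.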

In practice this way of mapping the structure is not usable. Firebase is a hierarchical data structure, so we should use the unique abilities that gives us over the more traditional relational model. For example: we don't need an ARTICLE_USER_COMMENT node; we can just keep this information directly under each article, user and comment itself.

A small snippet of this:

ROOT
 |
 +-- ARTICLES
 |     |
 |     +-- Text of article 1 (AID=1)
 |     .    |
 |     .    +-- (CID=1,UID=1056201)
 |     .    |
 |          +-- (CID=2,UID=209103)
 |
 +-- USERS
 |     |
 |     +-- vzhen (UID=1056201)
 |     .    |
 |     .    +-- (AID=1,CID=1)
 |     .    
 |
 +-- COMMENTS
       |
       +-- Vzhen's comment on Article 1 (CID=1)
       |
       +-- Frank's response (CID=2)
       |
       +-- Frank's comment on article 2 (CID=3)

You can see here that we're spreading the information from ARTICLE_USER_COMMENT over the article and user nodes. This is denormalizing the data a bit. The result is that we'll need to update multiple nodes when a user adds a comment to an article. In the example above we'd have to add the comment itself and then add a link node under both the relevant user node and the relevant article node. The advantage is that we have fewer nodes to read when we need to display the data.
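
A minimal sketch of that trade-off, again with plain Python dictionaries standing in for the tree rather than real Firebase calls. The keys `text`, `name` and `links` and both function names are assumptions made for the example. Adding a comment touches three places, while reading an article's comment links needs only the article node.

```python
# In-memory stand-in for the partially denormalized tree; not the Firebase SDK.
# The "text", "name" and "links" keys are illustrative names only.
root = {
    "ARTICLES": {"1": {"text": "Text of article 1", "links": {}}},
    "USERS": {
        "1056201": {"name": "vzhen", "links": {}},
        "209103": {"name": "Frank van Puffelen", "links": {}},
    },
    "COMMENTS": {},
}


def add_comment(tree, cid, aid, uid, text):
    """Write the comment itself plus one link under the article and the user."""
    tree["COMMENTS"][cid] = text
    tree["ARTICLES"][aid]["links"][cid] = {"UID": uid}
    tree["USERS"][uid]["links"][cid] = {"AID": aid}


def comments_for_article(tree, aid):
    """Read the article node once, then resolve each linked comment."""
    article = tree["ARTICLES"][aid]
    return [
        (link["UID"], tree["COMMENTS"][cid])
        for cid, link in article["links"].items()
    ]


add_comment(root, "1", "1", "1056201", "Vzhen's comment on Article 1")
add_comment(root, "2", "1", "209103", "Frank's response")
print(comments_for_article(root, "1"))
```

In this sketch the three writes are simply done one after another; in a real app you would have to think about what happens if only some of them succeed.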

If you take this denormalization to its most extreme, you end up with a data structure like this:

ROOT
 |
 +-- ARTICLES
 |     |
 |     +-- Text of article 1 (AID=1)
 |     |    |
 |     |    +-- Vzhen's comment on Article 1 (UID=1056201)
 |     |    |
 |     |    +-- Frank's response (UID=209103)
 |     |
 |     +-- Text of article 2 (AID=2)
 |          |
 |          +-- Frank's comment on Article 2 (UID=209103)
 |
 +-- USERS
       |
       +-- vzhen (UID=1056201)
       |    |
       |    +-- Vzhen's comment on Article 1 (AID=1)
       |
       +-- Frank van Puffelen (UID=209103)
            |
            +-- Frank's response (AID=1)
            |
            +-- Frank's comment on Article 2 (AID=2)

You can see that we got rid of the COMMENTS and ARTICLE_USER_COMMENT nodes in this last example. All the information about an article is now stored directly under the article node itself, including the comments on that article (with a "link" to the user who made the comment). And all the information about a user is now stored under that user's node, including the comments that user made (with a "link" to the article that the comment is about).

The only thing that is still tricky about this model is the fact that Firebase doesn't have an API to traverse such "links", so you will have to look up the user/article yourself. This becomes a lot easier if you use the UID/AID (in this example) as the name of the node that identifies the user/article.

So that leads to our final model:

ROOT
 |
 +-- ARTICLES
 |     |
 |     +-- AID_1
 |     |    |
 |     |    +-- Text of article 1
 |     |    |
 |     |    +-- COMMENTS
 |     |         |
 |     |         +-- Vzhen's comment on Article 1 (UID=1056201)
 |     |         |
 |     |         +-- Frank's response (UID=209103)
 |     |
 |     +-- AID_2
 |          |
 |          +-- Text of article 2
 |          |
 |          +-- COMMENTS
 |               |
 |               +-- Frank's comment on Article 2 (UID=209103)
 |
 +-- USERS
       |
       +-- UID_1056201
       |    |
       |    +-- vzhen
       |    |
       |    +-- COMMENTS
       |         |
       |         +-- Vzhen's comment on Article 1 (AID=1)
       |
       +-- UID_209103
            |
            +-- Frank van Puffelen
            |
            +-- COMMENTS
                 |
                 +-- Frank's response (AID=1)
                 |
                 +-- Frank's comment on Article 2 (AID=2)
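
Here is a sketch of why keying nodes by UID/AID helps. It models the final tree as nested Python dictionaries (not the Firebase SDK) and follows a "link" by building a path from the stored ID. The helper `get_path`, the function `article_with_authors`, and the keys `text`, `name`, `c1`, `c2`, `c3` are all assumptions for illustration.

```python
# In-memory stand-in for the final model; not the Firebase SDK.
# "text", "name" and the comment keys c1..c3 are illustrative names only.
root = {
    "ARTICLES": {
        "AID_1": {
            "text": "Text of article 1",
            "COMMENTS": {
                "c1": {"text": "Vzhen's comment on Article 1", "UID": "UID_1056201"},
                "c2": {"text": "Frank's response", "UID": "UID_209103"},
            },
        },
        "AID_2": {
            "text": "Text of article 2",
            "COMMENTS": {
                "c3": {"text": "Frank's comment on Article 2", "UID": "UID_209103"},
            },
        },
    },
    "USERS": {
        "UID_1056201": {
            "name": "vzhen",
            "COMMENTS": {
                "c1": {"text": "Vzhen's comment on Article 1", "AID": "AID_1"},
            },
        },
        "UID_209103": {
            "name": "Frank van Puffelen",
            "COMMENTS": {
                "c2": {"text": "Frank's response", "AID": "AID_1"},
                "c3": {"text": "Frank's comment on Article 2", "AID": "AID_2"},
            },
        },
    },
}


def get_path(tree, path):
    """Walk a '/'-separated path down the tree and return the node found there."""
    node = tree
    for key in path.split("/"):
        node = node[key]
    return node


def article_with_authors(tree, aid):
    """Return the article text and (author name, comment text) pairs."""
    article = get_path(tree, f"ARTICLES/{aid}")
    comments = []
    for comment in article["COMMENTS"].values():
        # The stored UID is also the node name, so the link is just a path.
        author = get_path(tree, f"USERS/{comment['UID']}/name")
        comments.append((author, comment["text"]))
    return article["text"], comments


print(article_with_authors(root, "AID_1"))
```

One read of the article node brings back the text and all its comments, and each author lookup is a direct path instead of a search.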

I hope this helps in understanding hierarchical data-modelling and the trade-offs involved.