Firestore - how to structure a feed and follow system

Zicsus · Oct 27, 2017 · Viewed 13.8k times

I was using the Firebase Realtime Database for my test social network app, in which you can follow people and receive the posts of the people you follow. A traditional social network. I structured my database something like this:

Users
--USER_ID_1
----name
----email
--USER_ID_2
----name
----email

Posts
--POST_ID_1
----image
----userid
----date
--POST_ID_2
----image
----userid
----date

Timeline
--User_ID_1
----POST_ID_2
------date
----POST_ID_1
------date

I also have another node, "Content", which just contained the IDs of all of a user's posts. So if "A" followed "B", then all the post IDs of B were added to A's Timeline. And if B posted something, it was also added to the Timeline of every one of B's followers.

Now this was my solution for the Realtime Database, but it clearly has some scalability issues:

  • If someone has 10,000 followers, then a new post is added to the Timeline of each of those 10,000 followers.
  • If someone has a large number of posts, then every new follower receives all of those posts in their Timeline.

These were some of the problems.
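
To make the cost of this copy-on-write approach concrete, here is a minimal in-memory sketch in Java. It is not a Firebase API: the class FanOutTimeline and its methods are hypothetical names used only to count how many Timeline writes each action triggers under the structure described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the "Timeline" node (hypothetical, not a Firebase API).
final class FanOutTimeline {
    // followers.get(author) = ids of the users who follow that author
    private final Map<String, List<String>> followers = new HashMap<>();
    // postsByAuthor.get(author) = ids of the posts written by that author
    private final Map<String, List<String>> postsByAuthor = new HashMap<>();
    // timeline.get(user) = post ids copied into that user's Timeline
    private final Map<String, List<String>> timeline = new HashMap<>();
    private long timelineWrites = 0;

    // Following copies every existing post of the author: one write per post.
    void follow(String follower, String author) {
        followers.computeIfAbsent(author, k -> new ArrayList<>()).add(follower);
        for (String postId : postsByAuthor.getOrDefault(author, new ArrayList<>())) {
            writeToTimeline(follower, postId);
        }
    }

    // Posting copies the new post id to every follower: one write per follower.
    void publish(String author, String postId) {
        postsByAuthor.computeIfAbsent(author, k -> new ArrayList<>()).add(postId);
        for (String follower : followers.getOrDefault(author, new ArrayList<>())) {
            writeToTimeline(follower, postId);
        }
    }

    private void writeToTimeline(String user, String postId) {
        timeline.computeIfAbsent(user, k -> new ArrayList<>()).add(postId);
        timelineWrites++;
    }

    long timelineWrites() {
        return timelineWrites;
    }

    List<String> timelineOf(String user) {
        return timeline.getOrDefault(user, new ArrayList<>());
    }
}
```

With 10,000 followers registered for "B", a single publish("B", ...) call performs 10,000 Timeline writes, and every later follow("x", "B") performs one write per post B has ever made.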

Now I am thinking of moving this whole thing to Firestore, as it is claimed to be "scalable". So how should I structure my database so that the problems I faced in the Realtime Database can be eliminated in Firestore?

Answer

Alex Mamo · Sep 3, 2018

I saw your question a little late, but I will still try to provide the best database structure I can think of. I hope you'll find this answer useful.

I'm thinking of a schema that has three top-level collections: one for users, one for the users that a user is following, and one for posts:

Firestore-root
   |
   --- users (collection)
   |     |
   |     --- uid (documents)
   |          |
   |          --- name: "User Name"
   |          |
   |          --- email: "[email protected]"
   |
   --- following (collection)
   |      |
   |      --- uid (document)
   |           |
   |           --- userFollowing (collection)
   |                 |
   |                 --- uid (documents)
   |                 |
   |                 --- uid (documents)
   |
   --- posts (collection)
         |
         --- uid (documents)
              |
              --- userPosts (collection)
                    |
                    --- postId (documents)
                    |     |
                    |     --- title: "Post Title"
                    |     |
                    |     --- date: September 03, 2018 at 6:16:58 PM UTC+3
                    |
                    --- postId (documents)
                          |
                          --- title: "Post Title"
                          |
                          --- date: September 03, 2018 at 6:16:58 PM UTC+3

"If someone has 10,000 followers, then a new post is added to the Timeline of each of those 10,000 followers."

That will be no problem at all, because this is exactly what collections are meant for in Firestore. According to the official documentation on modeling a Cloud Firestore database:

Cloud Firestore is optimized for storing large collections of small documents.

This is the reason I have added userFollowing as a collection and not as a simple object/map that holds other objects. Remember, the maximum size of a document, according to the official documentation on limits and quotas, is 1 MiB (1,048,576 bytes). In the case of a collection, there is no limit on the number of documents beneath it. In fact, this is the kind of structure Firestore is optimized for.

So having those 10,000 followers stored in this manner will work perfectly fine. Furthermore, you can query the database in such a way that there is no need to copy anything anywhere.

As you can see, the database is pretty much denormalized, which makes it very simple to query. Let's take some examples, but first let's create a connection to the database and get the uid of the user with the following lines of code:

FirebaseFirestore rootRef = FirebaseFirestore.getInstance();
String uid = FirebaseAuth.getInstance().getCurrentUser().getUid();

If you want to query the database to get all the users a user is following, you can use a get() call on the following reference:

CollectionReference userFollowingRef = rootRef.collection("following/" + uid + "/userFollowing");

So in this way, you can get all the user objects a user is following. Having their uids, you can simply get all their posts.

Let's say you want to get on your timeline the latest three posts of every user. The key to solving this problem when using very large data sets is to load the data in smaller chunks. I have explained in my answer from this post a recommended way in which you can paginate queries by combining query cursors with the limit() method. I also recommend you take a look at this video for a better understanding.

So to get the latest three posts of every user, you should consider using this solution: first get the first 15 user objects that you are following, and then, based on their uids, get their latest three posts. To get the latest three posts of a single user, where uid is now the uid of that followed user, please use the following query:

Query query = rootRef.collection("posts/" + uid + "/userPosts").orderBy("date", Query.Direction.DESCENDING).limit(3);

As you scroll down, load another 15 user objects and get their latest three posts, and so on. Besides the date, you can also add other properties to your post object, like the number of likes, comments, shares, and so on.
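
The paging flow described above can be sketched without the Firestore SDK. The following is a minimal in-memory illustration in Java, not Firestore code: TimelineSketch, Post, nextUidPage, latestPosts, and timelinePage are hypothetical names, the map postsByUid stands in for the posts/{uid}/userPosts subcollections, and dates are assumed to be epoch milliseconds. In a real app each helper would be replaced by a Firestore query such as the one shown earlier.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class TimelineSketch {
    // Minimal post model; the fields mirror the schema above (hypothetical class).
    static final class Post {
        final String authorUid;
        final String title;
        final long date; // assumed to be epoch milliseconds

        Post(String authorUid, String title, long date) {
            this.authorUid = authorUid;
            this.title = title;
            this.date = date;
        }
    }

    static final Comparator<Post> NEWEST_FIRST =
            Comparator.comparingLong((Post p) -> p.date).reversed();

    // Cursor-style paging over the followed uids: returns up to pageSize uids that
    // come after lastSeenUid. Pass null for the first page; an unknown cursor also
    // restarts from the first page. Assumes followedUids has a stable order.
    static List<String> nextUidPage(List<String> followedUids, String lastSeenUid, int pageSize) {
        int start = lastSeenUid == null ? 0 : followedUids.indexOf(lastSeenUid) + 1;
        int end = Math.min(followedUids.size(), start + pageSize);
        return new ArrayList<>(followedUids.subList(start, end));
    }

    // Stand-in for ordering one user's posts by date, descending, with a limit.
    static List<Post> latestPosts(List<Post> userPosts, int perUser) {
        List<Post> sorted = new ArrayList<>(userPosts);
        sorted.sort(NEWEST_FIRST);
        return new ArrayList<>(sorted.subList(0, Math.min(perUser, sorted.size())));
    }

    // Builds one timeline page: the latest posts of each uid in the page,
    // merged and sorted newest first.
    static List<Post> timelinePage(Map<String, List<Post>> postsByUid,
                                   List<String> uidPage, int perUser) {
        List<Post> page = new ArrayList<>();
        for (String uid : uidPage) {
            page.addAll(latestPosts(postsByUid.getOrDefault(uid, new ArrayList<>()), perUser));
        }
        page.sort(NEWEST_FIRST);
        return page;
    }
}
```

A first page would be built with nextUidPage(followed, null, 15) followed by timelinePage(postsByUid, page, 3). When the user scrolls, the last uid of the previous page is passed as the cursor to fetch the next 15.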

"If someone has a large number of posts, then every new follower receives all of those posts in their Timeline."

No way. There is no need to do something like this. I have already explained above why.

Edit May 20, 2019:

Another solution to optimize the operation in which the user should see all the recent posts of everyone they follow is to store the posts that the user should see in a document for that user.

So if we take an example, let's say Facebook, you'll need to have a document containing the Facebook feed for each user. However, if there is more data than a single document can hold (1 MiB), you need to put that data in a collection, as explained above.
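
As a rough way to judge when a single feed document stops being enough, here is a back-of-the-envelope sketch in Java. The class FeedDocumentEstimate and its methods are hypothetical, and the per-entry size is an assumption: a 20-character post ID stored as a string is taken to cost about 21 bytes. Field names and other per-document overhead are ignored, so the real capacity will be somewhat lower.

```java
final class FeedDocumentEstimate {
    // The 1 MiB document limit quoted above.
    static final long MAX_DOCUMENT_BYTES = 1_048_576L;

    // Rough number of entries that fit in one document, ignoring field names
    // and other per-document overhead.
    static long maxEntries(int bytesPerEntry) {
        if (bytesPerEntry <= 0) {
            throw new IllegalArgumentException("bytesPerEntry must be positive");
        }
        return MAX_DOCUMENT_BYTES / bytesPerEntry;
    }

    // True when the feed no longer fits in one document under this estimate
    // and should be stored as a collection instead.
    static boolean needsCollection(long entryCount, int bytesPerEntry) {
        return entryCount > maxEntries(bytesPerEntry);
    }
}
```

Under these assumptions, maxEntries(21) gives 49,932 post IDs per document. If each feed entry stores more than an ID (title, date, author), the per-entry size grows and the capacity shrinks accordingly.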