Range query for MongoDB pagination

Question 1

Range query for MongoDB pagination

javascript mongodb pagination database

user1082754 · Jan 6, 2014 · Viewed 33.7k times · Source

Answer

Answer

It is perfectly fine to use ObjectId() though your syntax for pagination is wrong. You want:

 db.tweets.find().limit(50).sort({"_id":-1});

This says you want tweets sorted by _id value in descending order and you want the most recent 50. Your problem is the fact that pagination is tricky when the current result set is changing - so rather than using skip for the next page, you want to make note of the smallest _id in the result set (the 50th most recent _id value and then get the next page with:

 db.tweets.find( {_id : { "$lt" : <50th _id> } } ).limit(50).sort({"_id":-1});

This will give you the next "most recent" tweets, without new incoming tweets messing up your pagination back through time.

There is absolutely no need to worry about whether _id value is strictly corresponding to insertion order - it will be 99.999% close enough, and no one actually cares on the sub-second level which tweet came first - you might even notice Twitter frequently displays tweets out of order, it's just not that critical.

If it is critical, then you would have to use the same technique but with "tweet date" where that date would have to be a timestamp, rather than just a date.

Question 2

I want to implement pagination on top of a MongoDB. For my range query, I thought about using ObjectIDs:

db.tweets.find({ _id: { $lt: maxID } }, { limit: 50 })

However, according to the docs, the structure of the ObjectID means that "ObjectId values do not represent a strict insertion order":

The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system generate values, within a single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod process.

I then thought about querying with a timestamp:

db.tweets.find({ created: { $lt: maxDate } }, { limit: 50 })

However, there is no guarantee the date will be unique — it's quite likely that two documents could be created within the same second. This means documents could be missed when paging.

Is there any sort of ranged query that would provide me with more stability?

Range query for MongoDB pagination

Answer

Related questions