How to return a large number of rows from MongoDB using a Node.js HTTP server?

Timo · May 11, 2012 · Viewed 13.5k times

I have a user database in MongoDB that I would like to export via a REST interface as JSON. The problem is that in the worst case the number of returned rows is well over 2 million.

First I tried this:

var mongo = require('mongodb'),
  Server = mongo.Server,
  Db = mongo.Db;
var server = new Server('localhost', 27017, {auto_reconnect: true});
var db = new Db('tracking', server);
var http = require('http');

http.createServer(function (request, response) {
  db.collection('users', function(err, collection) {
    collection.find({}, function(err, cursor){
      cursor.toArray(function(err, items) {
        // Declare locally; the original assignment created an implicit global
        var output = '{"users" : ' + JSON.stringify(items) + '}';

        response.setHeader("Content-Type", "application/json");
        response.end(output);
      });
    });
  });
}).listen(8008);
console.log('Server running at localhost:8008');

This fails by running out of memory. The example uses the node-mongodb-native driver and Node's built-in http module.

FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory

(Note that in the real scenario I use parameters that limit the results as needed, but this example queries all rows, which is the worst case.)

The data itself is simple, for example:

{ "_id" : ObjectId("4f993d1c5656d3320851aadb"), "userid" : "80ec39f7-37e2-4b13-b442-6bea57472537", "user-agent" : "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322)", "ip" : "127.0.0.1", "lastupdate" : 1335442716 }

I also tried something like this:

while(cursor != null)
{
  cursor.nextObject(function(err, item) {
    response.write(JSON.stringify(item));
  });
}

but that ran out of memory too.

How should I proceed? There should be a way to stream the data row by row, but I haven't been able to find a suitable example. Paging the data is out of the question because of external application requirements. I thought of writing the data to a file and then posting it, but that leads to unwanted I/O.

Answer

sha0coder · Dec 12, 2012

The cursor.streamRecords() method of the native MongoDB driver is deprecated; use stream() instead, which is also faster.

I have parsed a catalog of 40,000,000 rows without problems using MongoDB + stream() + process.nextTick().
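For illustration, here is a minimal sketch of how a cursor stream might be piped into the HTTP response one row at a time. It is not necessarily what I ran: the helper name streamUsersAsJson and its done callback are hypothetical, and it assumes the source is an object-mode stream that emits 'data', 'end' and 'error' and supports pause() and resume() (some older driver releases may signal completion with 'close' instead of 'end'). Instead of process.nextTick(), it pauses the source whenever response.write() reports a full buffer, so only a bounded number of rows is held in memory.

```javascript
// Hypothetical helper: writes every object emitted by `source` to `response`
// as {"users" : [...]} without collecting the rows in memory first.
// Assumes `source` emits 'data', 'end', 'error' and supports pause()/resume().
function streamUsersAsJson(source, response, done) {
  var first = true;
  response.write('{"users" : [');

  source.on('data', function (item) {
    // Separate array elements with commas, but not before the first one
    var chunk = (first ? '' : ',') + JSON.stringify(item);
    first = false;

    // write() returns false when the outgoing buffer is full: pause the
    // source until the response has drained, then continue reading.
    if (!response.write(chunk)) {
      source.pause();
      response.once('drain', function () {
        source.resume();
      });
    }
  });

  source.on('end', function () {
    // Close the JSON array and object, then finish the response
    response.end(']}');
    if (done) done(null);
  });

  source.on('error', function (err) {
    // The JSON sent so far is incomplete; end the response and report the error
    response.end();
    if (done) done(err);
  });
}
```

Inside the http.createServer handler from the question, this would be called roughly as streamUsersAsJson(collection.find({}).stream(), response) after setting the Content-Type header. The same helper works with any object-mode readable stream, which makes it easy to try without a database.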