How to determine ops cause getMore operations on a MongoDB server

Question 1

How to determine ops cause getMore operations on a MongoDB server

mongodb cursor replicaset

Ben Zittlau · Feb 7, 2014 · Viewed 10.5k times · Source

Answer

Answer

A getmore indicates that a "next batch of results" was fetched from an open cursor. This may indicate either a large result set where your application has to fetch several batches, or a long running operation such as replication tailing the oplog.

Given there is a large jump starting from the 26th Jan, I would consider if you changed anything in your application or usage that might correlate with this change. Perhaps you started a new backup routine, or one of your common queries now has enough data to require multiple batches.

If you want to determine the source of the increased getmore count, you can see these logged at logLevel 1 or higher.

The logLevel parameter can be set on startup or adjusted at runtime. The logs will get large quickly with the extra detail, so I would suggest increasing for a short period of time to sample getmores during one of the peak periods indicated in MMS.

To increase logLevel in the mongo shell use:

db.adminCommand( { setParameter: 1, logLevel: 1 } )

You can change back to the default logLevel with:

db.adminCommand( { setParameter: 1, logLevel: 0 } )

After increasing the logLevel, you should be able to grep your mongod log for lines with "getmore" which will be similar to:

Sun Feb  9 09:31:14.573 [conn8527] getmore local.oplog.rs query: { .. }

The text after "getmore" (and before the query) indicates the namespace (i.e. "database.collection").

The getmore queries from replication or any other applications tailing the oplog (such as MMS Backup or MongoDB Connector) will always be on the local.oplog.rs namespace.

Question 2

I am seeing the following in mongostat on the primary server in my replicate set:

insert  query update delete getmore command flushes mapped  vsize    res faults       locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn   set repl       time
     0    414    388      0    1218     444       0  24.2g  51.4g     3g      0  <redacted>:9.6%          0       0|0     0|0   225k   441k   463 mover  PRI   03:26:30
     0    469    457      0    1352     516       0  24.2g  51.4g     3g      0 <redacted>:10.6%          0       0|0     0|0   258k   498k   463 mover  PRI   03:26:31
     0    478    482      0    1430     548       0  24.2g  51.4g     3g      0 <redacted>:12.0%          0       0|0     0|0   271k   512k   463 mover  PRI   03:26:32

As you can see the getmore count is dominating any other form of operation. This is something that has just recently started happening, as can be seen in the attached graph of opcounters. From what I've read these might be from the replica set operations, but for my piece of mind I would like to find a way to confirm that. Is there anyway I can determine the source of these getmore ops?

enter image description here

UPDATE: Added a new screenshot of opcounters from MMS which includes the legend.

How to determine ops cause getMore operations on a MongoDB server

Answer

Related questions