The result:
If you are operating on a dataset that is fault tolerant, or doing a one-time process you can verify afterwards, changing the WriteConcern to Unacknowledged can help.
Also, bulk operations are ordered (IsOrdered = true) by default, which I was not aware of. Setting this to false lets the operation actually execute in bulk; otherwise it runs as a single ordered stream of updates.
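For illustration, both changes together end up looking roughly like this with the C# driver (a sketch, not my exact code; the variable names are placeholders):

var fireAndForget = m_Collection.WithWriteConcern(WriteConcern.Unacknowledged); // w:0 - no per-write acknowledgement
await fireAndForget.BulkWriteAsync(updates, new BulkWriteOptions { IsOrdered = false }); // unordered, so the batch is not serialized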
MongoDB 3.0 / WiredTiger / C# Driver
I have a collection with 147,000,000 documents, against which I am (hopefully) performing updates of approx. 3,000 documents each second.
Here is an example update:
"query" : {
"_id" : BinData(0,"UKnZwG54kOpT4q9CVWbf4zvdU223lrE5w/uIzXZcObQiAAAA")
},
"updateobj" : {
"$set" : {
"b" : BinData(0,"D8u1Sk/fDES4IkipZzme7j2qJ4oWjlT3hvLiAilcIhU="),
"s" : true
}
}
This is a typical update; my requirement is that these are applied at a rate of 3,000 per second.
Unfortunately they are taking roughly twice as long; for instance, the last update covered 1,723 documents and took 1,061 ms.
The collection has an index on _id only, no other indexes; the average document size is 244 bytes and the collection is uncapped.
The server has 64GB of memory, 12 threads. Insert performance is excellent with lower collection sizes, say around 50 million, but after about 80 million really starts to drop off.
Could it be because the entire set does not sit in memory? The database is backed by RAID0 SSDs, so IO performance should not be a bottleneck, and if it were, it should have shown up from the start.
I would appreciate some guidance, as I'm confident MongoDB can fulfill my rather meager requirements compared to some of the applications it is used in. There is not a substantial read rate on the database, so sharding would not improve matters, although perhaps I am wrong.
Either way, the current insert rate is not good enough.
Update: Here is the explain() of just the query...
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "Collection",
"indexFilterSet" : false,
"parsedQuery" : {
"_id" : {
"$eq" : { "$binary" : "SxHHwTMEaOmSc9dD4ng/7ILty0Zu0qX38V81osVqWkAAAAAA", "$type" : "00" }
}
},
"winningPlan" : {
"stage" : "IDHACK"
},
"rejectedPlans" : []
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 1,
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,
"executionStages" : {
"stage" : "IDHACK",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 1,
"needTime" : 0,
"needFetch" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keysExamined" : 1,
"docsExamined" : 1
},
"allPlansExecution" : []
},
The query itself is very fast, and the update operation takes about 25 milliseconds. The updates are pushed to Mongo via the bulk writer: await m_Collection.BulkWriteAsync(updates);
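For context, the updates list is built roughly like this (a simplified sketch; batch, item and the property names on it are illustrative):

var updates = new List<WriteModel<BsonDocument>>();
foreach (var item in batch)
{
    var filter = Builders<BsonDocument>.Filter.Eq("_id", item.Id);                  // match on _id only
    var update = Builders<BsonDocument>.Update.Set("b", item.Data).Set("s", true);  // same shape as the example update above
    updates.Add(new UpdateOneModel<BsonDocument>(filter, update));
}
await m_Collection.BulkWriteAsync(updates);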
You can try modifying the write concern level. Obviously there is a risk in this, as you would no longer be able to catch write errors, though you should still be able to capture network errors. Since MongoDB groups the bulk operations into batches of 1,000, this should speed up the process.
W by default is 1:
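(Sketched with the C# driver; database and "myCollection" are placeholder names.)

var collection = database.GetCollection<BsonDocument>("myCollection")
    .WithWriteConcern(WriteConcern.W1); // acknowledged: each batch waits for confirmation from the primary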
When you change it to 0:
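(Same placeholder names as above.)

var collection = database.GetCollection<BsonDocument>("myCollection")
    .WithWriteConcern(WriteConcern.Unacknowledged); // w:0 - fire and forget, write errors are not reported back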
If you are not concerned about the order of the operations, you can gain some speed by using an unordered bulk write:
await m_Collection.BulkWriteAsync(updates, new BulkWriteOptions() { IsOrdered = false });
With an unordered list of operations, MongoDB can execute the writes in parallel and in any order. Link