Pros and Cons of using MongoDB instead of MS SQL Server

theGeekster · Nov 2, 2012 · Viewed 49.1k times

I am new to the NoSQL world and am thinking of replacing my MS SQL Server database with MongoDB. My application (written in .NET C#) interacts with IP cameras and records metadata for each image coming from a camera into an MS SQL Server database. On average, I am inserting about 86,400 records per day for each camera, and in the current database schema I have created a separate table for each camera's images, e.g. Camera_1_Images, Camera_2_Images ... Camera_N_Images. A single image record consists of simple metadata such as AutoId, FilePath, CreationDate. To add more detail, my application starts a separate process (.exe) for each camera, and each process inserts 1 record per second into its respective table in the database.

I need suggestions from (MongoDB) experts on the following concerns:

  1. Is MongoDB good for holding such data, which will eventually be queried by time range (e.g. retrieve all images of a particular camera within a specified hour)? Any suggestions about document-based schema design for my case?

  2. What should the server specs be (CPU, RAM, disk)? Any suggestions?

  3. Should I consider sharding/replication for this scenario (taking into account the write performance cost of syncing replica sets)?

  4. Are there any benefits to using multiple databases on the same machine, so that one database holds the current day's images for all cameras and the second one archives previous days' images? I am thinking about this in terms of splitting reads and writes across separate databases, because all read requests might be served by the second database and writes would go to the first one. Will it be beneficial or not? If yes, any idea how to ensure that both databases are always in sync?

Any other suggestions are welcome.

Answer

Aravind Yarram · Nov 2, 2012

I am a beginner with NoSQL databases myself. So I am answering this at the risk of potential down votes, but it will be a great learning experience for me.

Before trying my best to answer your questions, I should say that if MS SQL Server is working well for you then stick with it. You have not mentioned any valid reason WHY you want to use MongoDB, except the fact that you learnt about it as a document oriented db. Moreover, I see that you are capturing almost the same set of metadata for each camera, i.e. your schema is not dynamic.

  • Is MongoDB good for holding such data, which will eventually be queried by time range (e.g. retrieve all images of a particular camera within a specified hour)? Any suggestions about document-based schema design for my case?

MongoDB, being a document oriented db, is good at querying within an aggregate (what you call a document). Since you are already storing each camera's data in its own table, in MongoDB you could create a separate collection for each camera. Date range queries are then expressed with comparison operators such as $gte and $lt on the date field.
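As a rough illustration of the date range query mentioned above, here is a minimal Python sketch. The field names FilePath and CreationDate come from the question; the helper names and sample paths are made up for this example. The filter uses MongoDB's `$gte` and `$lt` operators, and the small `matches` function is only a local stand-in so the sketch runs without a server.

```python
from datetime import datetime, timedelta

def image_document(file_path, created_at):
    """Build one image-metadata document (helper name is hypothetical)."""
    return {"FilePath": file_path, "CreationDate": created_at}

def hour_range_filter(start):
    """Build a filter matching documents created in [start, start + 1 hour)."""
    end = start + timedelta(hours=1)
    return {"CreationDate": {"$gte": start, "$lt": end}}

def matches(doc, query):
    """Local stand-in for the server: evaluates only $gte/$lt conditions."""
    for field, cond in query.items():
        value = doc[field]
        if "$gte" in cond and not value >= cond["$gte"]:
            return False
        if "$lt" in cond and not value < cond["$lt"]:
            return False
    return True

# Three sample records for one camera, one per hour (made-up paths).
docs = [
    image_document(f"/cam1/{h:02d}0000.jpg", datetime(2012, 11, 2, h, 0, 0))
    for h in (9, 10, 11)
]

# Ask for everything recorded between 10:00 and 11:00.
query = hour_range_filter(datetime(2012, 11, 2, 10))
hits = [d for d in docs if matches(d, query)]
```

With a driver such as pymongo, the same `query` dictionary would be passed to the camera collection's `find` method, and an index on CreationDate would probably be needed to keep such range scans cheap.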

  • What should the server specs be (CPU, RAM, disk)? Any suggestions?

Most NoSQL databases are built to scale out on commodity hardware. But from the way you have asked the question, you might be thinking of improving performance by scaling up. You can start with a reasonable machine and, as the load increases, keep adding more servers (scaling out). You do not need to plan for and buy a high-end server.
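To get a feel for the load before picking hardware, a back-of-the-envelope calculation may help. The one-record-per-second rate is from the question; the camera count and the per-document size below are pure assumptions and should be replaced with measured values.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400: one record per second per camera

def daily_records(cameras):
    """Records inserted per day across all cameras."""
    return cameras * SECONDS_PER_DAY

def daily_bytes(cameras, bytes_per_doc):
    """Raw document bytes per day, ignoring indexes and storage overhead."""
    return daily_records(cameras) * bytes_per_doc

cameras = 10         # assumption: number of cameras
bytes_per_doc = 200  # assumption: average size of one metadata document

per_day = daily_bytes(cameras, bytes_per_doc)
print(f"{daily_records(cameras):,} records/day, "
      f"about {per_day / 1e6:.1f} MB/day before indexes")
```

The estimate leaves out index size and storage overhead, so it is a lower bound rather than a sizing recommendation.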

  • Should I consider sharding/replication for this scenario (taking into account the write performance cost of syncing replica sets)?

MongoDB locks the entire db for a single write (but yields for other operations) and is meant for systems which have more reads than writes. So this depends on what your system looks like. There are multiple ways to shard, and the choice should be domain specific, so a generic answer is not possible. However, some examples can be given, like sharding by geography, by branch, etc.

Also read "A plain english introduction to CAP Theorem".

Updated with answer to the comment on sharding

According to their documentation, you should consider deploying a sharded cluster if:

  • your data set approaches or exceeds the storage capacity of a single node in your system.
  • the size of your system’s active working set will soon exceed the capacity of the maximum amount of RAM for your system.
  • your system has a large amount of write activity, a single MongoDB instance cannot write data fast enough to meet demand, and all other approaches have not reduced contention.

So, based on the last point, yes. The auto-sharding feature is built to scale writes. In that case, you have a write lock per shard, not per database. But mine is a theoretical answer. I suggest you consult the 10gen.com group.
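To illustrate why sharding spreads write load, here is a small conceptual sketch that routes each camera's writes to one of several shards by hashing the camera id. This is not how MongoDB actually routes or balances data; the function name, the choice of camera id as the key, and the shard count are assumptions made for the example.

```python
import hashlib
from collections import Counter

def shard_for(camera_id, shard_count):
    """Map a camera id to a shard index with a stable hash (illustration only)."""
    digest = hashlib.md5(str(camera_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count

# Assumption: 30 cameras writing into a 3-shard cluster.
placement = Counter(shard_for(cam, 3) for cam in range(1, 31))

for shard, cameras in sorted(placement.items()):
    print(f"shard {shard}: {cameras} cameras writing")
```

Each camera always lands on the same shard, so its one-insert-per-second stream only contends with the other cameras on that shard rather than with every camera in the system.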