In what way does denormalization improve database performance?

Roman · Feb 27, 2010 · Viewed 60.7k times

I've heard a lot about denormalization being used to improve the performance of certain applications, but I've never tried it myself.

So I'm curious: which parts of a normalized database hurt performance? In other words, what are the principles of denormalization?

How can I use this technique if I need to improve performance?

Answer

Pascal MARTIN · Feb 28, 2010

Denormalization is generally used to either:

  • Avoid a certain number of queries
  • Remove some joins

The basic idea of denormalization is that you add redundant data, or group some data together, so that it can be retrieved more easily and at a lower cost, which is better for performance.


A quick example?

  • Consider a "Posts" table and a "Comments" table, for a blog
    • For each post, you'll have several rows in the "Comments" table
    • This means that to display a list of posts with the associated number of comments, you'll have to:
      • Do one query to list the posts
      • Do one query per post to count how many comments it has (yes, those can be merged into a single query with a join and a GROUP BY, to get the count for all posts at once)
      • Either way, that means several queries, or at least one join plus an aggregation.
  • Now, if you add a "number of comments" field into the Posts table:
    • You only need one query to list the posts
    • And there is no need to query the Comments table: the number of comments is already denormalized into the Posts table.
    • And a single query that returns one more field is cheaper than several queries.
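
The two ways of reading the list of posts can be sketched as follows. This is only an illustration, assuming SQLite through Python's standard `sqlite3` module; the table and column names (`posts`, `comments`, `comment_count`) are made up for the example and are not from any particular blog schema.

```python
import sqlite3

# In-memory database; the schema and names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        comment_count INTEGER NOT NULL DEFAULT 0  -- denormalized copy of COUNT(comments)
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        body TEXT NOT NULL
    );
    INSERT INTO posts (id, title, comment_count)
        VALUES (1, 'First post', 2), (2, 'Second post', 0);
    INSERT INTO comments (post_id, body)
        VALUES (1, 'Nice'), (1, 'Thanks');
""")

# Normalized read: the count is derived with a join and an aggregation.
normalized = conn.execute("""
    SELECT p.id, p.title, COUNT(c.id)
    FROM posts AS p
    LEFT JOIN comments AS c ON c.post_id = p.id
    GROUP BY p.id, p.title
    ORDER BY p.id
""").fetchall()

# Denormalized read: the count is just another column of the Posts table.
denormalized = conn.execute(
    "SELECT id, title, comment_count FROM posts ORDER BY id"
).fetchall()
```

Both queries return the same rows here, but the second one never touches the comments table. How much that saves in practice depends on the database, the indexes and the data volume, so it is worth measuring before denormalizing.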

Now, there are some costs, yes:

  • First, this costs some space, both on disk and in memory, because you store redundant information:
    • The number of comments is stored in the Posts table
    • And you can also get that number by counting rows in the Comments table
  • Second, each time someone adds/removes a comment, you have to:
    • Save/delete the comment, of course
    • But also, update the corresponding number in the Posts table.
    • But if your blog has a lot more people reading than writing comments, this is probably not so bad: the extra write is paid rarely, while every read benefits.
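
The write-side cost can be sketched like this, again assuming SQLite through Python's `sqlite3` module. The helpers `add_comment` and `remove_comment` and the column `comment_count` are hypothetical names for the example. Each helper changes the comment and the stored counter inside one transaction, so the two cannot drift apart if one of the statements fails.

```python
import sqlite3

# In-memory database; the schema and names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        comment_count INTEGER NOT NULL DEFAULT 0  -- denormalized counter
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        body TEXT NOT NULL
    );
    INSERT INTO posts (id, title) VALUES (1, 'First post');
""")


def add_comment(conn, post_id, body):
    """Insert a comment and bump the post's counter in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO comments (post_id, body) VALUES (?, ?)",
            (post_id, body),
        )
        conn.execute(
            "UPDATE posts SET comment_count = comment_count + 1 WHERE id = ?",
            (post_id,),
        )
    return cur.lastrowid


def remove_comment(conn, comment_id):
    """Delete a comment and decrement the post's counter in one transaction."""
    with conn:
        row = conn.execute(
            "SELECT post_id FROM comments WHERE id = ?", (comment_id,)
        ).fetchone()
        if row is None:
            return False  # nothing to delete, counter left untouched
        conn.execute("DELETE FROM comments WHERE id = ?", (comment_id,))
        conn.execute(
            "UPDATE posts SET comment_count = comment_count - 1 WHERE id = ?",
            (row[0],),
        )
    return True


# Usage: two comments added, one removed.
first_id = add_comment(conn, 1, "Nice")
add_comment(conn, 1, "Thanks")
remove_comment(conn, first_id)

stored = conn.execute(
    "SELECT comment_count FROM posts WHERE id = 1"
).fetchone()[0]
actual = conn.execute(
    "SELECT COUNT(*) FROM comments WHERE post_id = 1"
).fetchone()[0]
```

After these calls, `stored` and `actual` should agree. A database trigger would be another way to keep the counter in sync; doing it in application code, as here, is just one option and only works if every write goes through these helpers.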