What databases do the World Wide Web's biggest sites run on?

niktech · Jul 11, 2009 · Viewed 56k times

This question is meant to collect a list of the databases, and their configurations, that the major web sites use. It should be a useful reference for anyone thinking of scaling their web site to the size of Twitter, Facebook or even Google.

Please keep your answers brief and be sure to cite any sources used.

EDIT:

Also, please bold both the web-site name and the database for easier scanning.

Answer

niktech · Jul 11, 2009

Facebook.com

  • MySQL with MyRocks. Used to store user info and social activities such as likes, comments, and shares.
  • Hive (Data warehouse for Hadoop; supports tables and a variant of SQL called HiveQL). Used for "simple summarization jobs, business intelligence and machine learning and many other applications".
  • Cassandra (Multi-dimensional, distributed key-value store). Currently used for Facebook's private messaging.
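
For anyone unfamiliar with the term, a "multi-dimensional key-value store" can be pictured as a map of maps: a row key leads to a column family, which in turn holds named columns. Below is a minimal in-memory Python sketch of that idea only. The class and the names in it (`TinyColumnStore`, `user:42`, `inbox`, `msg:0001`) are made up for illustration; this is not Cassandra's API, and nothing in it is distributed or persistent.

```python
from collections import defaultdict


class TinyColumnStore:
    """Toy, single-process model of a multi-dimensional key-value map.

    Hypothetical illustration only: not Cassandra's API, no distribution.
    """

    def __init__(self):
        # row key -> column family -> column name -> value
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        self._rows[row_key][family][column] = value

    def get(self, row_key, family, column, default=None):
        # Use .get at each level so reads never create empty entries.
        return self._rows.get(row_key, {}).get(family, {}).get(column, default)

    def columns(self, row_key, family):
        # Column names of one family, in sorted order.
        return sorted(self._rows.get(row_key, {}).get(family, {}))


# Made-up sample data: two messages in one user's "inbox" family.
store = TinyColumnStore()
store.put("user:42", "inbox", "msg:0001", "hello")
store.put("user:42", "inbox", "msg:0002", "lunch?")
print(store.get("user:42", "inbox", "msg:0002"))
print(store.columns("user:42", "inbox"))
```

The point of the extra dimensions is that one lookup by row key reaches a whole group of related columns, which a flat key-to-value map would need several keys to express.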

Facebook currently runs 610 (soon to be 1000) Hadoop nodes in a single cluster with a Hive datastore. Both Hive and Cassandra have been open-sourced by Facebook.

Facebook stats:

  • More than 200 million active users
  • More than 100 million users log on to Facebook at least once each day
  • More than 30 million users update their statuses at least once each day
  • Average user has 120 friends on the site

Sources: