Datastore vs Cloud SQL in Google App Engine

user3376321 · Mar 7, 2014 · Viewed 13.7k times

I want to build an application that will serve a lot of people (more than 2 million), so I think I should use Google Cloud Datastore. However, I also know that there is the option of using Google Cloud SQL and still serving a lot of people with MySQL (as Facebook and YouTube do).

Is it a correct assumption that I should use Datastore rather than the relational Cloud SQL with this many users? Thank you in advance.

Answer

Robert D · Apr 14, 2014

To give an intelligent answer, I would need to know a lot more about your app. But... I'll outline the biggest gotchas I've found...

Google Datastore is effectively a distributed hierarchical data store. To get the scalability they wanted, there had to be some compromises. As a developer, you will find that these range from easy to work around, to difficult to work around, to impossible to work around. The last is far more likely than you would ever assume.

If you are accustomed to relational databases and the ability to manipulate data across multiple tables within the same transaction, you are likely to pull your hair out with Datastore. The biggest(?) gotcha is that a transaction can span only a limited number of entity groups (5 at the current time). To give a simple example, say you had a simple parent-child relationship and you needed to update child records under more than 5 parents at the same time within a transaction... can't be done (yes, really). If you reorganize your data structures and put all of the former child records under a single entity so they can be updated in a single transaction, you will come across another limitation... you can't reliably update the same entity group more than once per second (yes, really). And if you query an entity type across parents without specifying the root entity of each, you will get what is euphemistically referred to as "eventual consistency"... which means the results aren't guaranteed to be consistent (yes, really).
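To make the first two limits concrete, here is a toy model in plain Python. It is only a sketch and does not use the Datastore API: `ToyTransaction`, `TooManyEntityGroups`, `min_seconds_for_writes`, and the `parent-N` keys are all made-up names for illustration. The limit of 5 entity groups and the rate of roughly one write per second per entity group are the figures quoted above; the real service may enforce them differently.

```python
# Toy model only: none of these names belong to the real Datastore API.
MAX_GROUPS_PER_TXN = 5  # the entity-group limit quoted in the answer


class TooManyEntityGroups(Exception):
    """Raised when a toy transaction touches too many entity groups."""


class ToyTransaction:
    def __init__(self, max_groups=MAX_GROUPS_PER_TXN):
        self.max_groups = max_groups
        self.writes = []     # pending (root_key, child_id, value) writes
        self.groups = set()  # distinct entity groups (root keys) touched

    def put(self, root_key, child_id, value):
        # In this model an entity group is identified by its root key.
        if root_key not in self.groups and len(self.groups) >= self.max_groups:
            raise TooManyEntityGroups(
                "transaction would span %d entity groups (limit %d)"
                % (len(self.groups) + 1, self.max_groups)
            )
        self.groups.add(root_key)
        self.writes.append((root_key, child_id, value))


def min_seconds_for_writes(num_writes, writes_per_second=1.0):
    """Rough lower bound on the time for sequential writes to ONE entity group."""
    return num_writes / writes_per_second


def demo():
    txn = ToyTransaction()
    # Children under five different parents: still within the limit.
    for n in range(5):
        txn.put("parent-%d" % n, "child-1", {"status": "done"})
    try:
        # A child under a sixth parent pushes the transaction over the limit.
        txn.put("parent-5", "child-1", {"status": "done"})
    except TooManyEntityGroups as exc:
        return str(exc)
    return "ok"
```

Calling `demo()` returns the error message for the sixth parent. The second helper shows why moving everything under one root is not a free fix: at about one write per second, `min_seconds_for_writes(600)` suggests that 600 separate updates to that single group would take on the order of ten minutes.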

The above is all in Google's documentation, but you are likely to gloss over it if you are just getting started (of course it can handle it!).