Relationship between catalog, schema, user, and database instance

ruruskyi picture ruruskyi · Oct 30, 2011 · Viewed 42.9k times · Source

To compare databases of different vendors (Oracle, SQL Server, DB2, MySQL, and PostgreSQL) how can I identify any object uniquely and do I need a catalog? For instance, In Java's DatabaseMetadata I should specify catalog and schema fooPattern at least.

Is it true that catalog is just an abstraction of data storage?

Answer

filiprem picture filiprem · Oct 30, 2011

In Oracle:

  • server instance == database == catalog == all data managed by same execution engine
  • schema == namespace within database, identical to user account
  • user == schema owner == named account, identical to schema, who can connect to database, who owns the schema and use objects possibly in other schemas
  • to identify any object in running server, you need (schema name + object name)

In PostgreSQL:

  • server instance == db cluster == all data managed by same execution engine
  • database == catalog == single database within db cluster, isolated from other databases in same db cluster
  • schema == namespace within database
  • user == named account, who can connect to database, own and use objects in each allowed database separately
  • to identify any object in running server, you need (database name + schema name + object name)

In MySQL:

  • server instance == not identified with catalog, just a set of databases
  • database == schema == catalog == a namespace within the server.
  • user == named account, who can connect to server and use (but can not own - no concept of ownership) objects in one or more databases
  • to identify any object in running server, you need (database name + object name)

In Microsoft SQL Server:

  • server instance == set of managed databases
  • database == namespace qualifier within the server, rarely referred to as catalog
  • schema == owner == namespace within the database, tied to database roles, by default only dbo is used
  • user == named account, who can connect to server and use (but can not own - schema works as owner) objects in one or more databases
  • to identify any object in running server, you need (database name + owner + object name)

So I think answer to your questions is:

  1. It depends on implementation, whether catalog name is needed to identify objects. The meaning of "catalog", "schema" and "database" vary from one implementation to another.

  2. Yes, a catalog is an abstraction of data storage. I think it should be also defined as a self-contained isolated namespace, but not all SQL engines do it.

  3. Database and schema are pretty well defined by all vendors. Catalog is sometimes synonymous to "database" (at least in Oracle and Postgres), sometimes synonymous to "schema", and sometimes synonymous to both. The term catalog also often means metadata collection (aka system tables).