Composite PRIMARY KEY enforces NOT NULL constraints on involved columns

user3422637 picture user3422637 · Oct 30, 2014 · Viewed 16k times · Source

This is one strange, unwanted behavior I encountered in Postgres: When I create a Postgres table with composite primary keys, it enforces NOT NULL constraint on each column of the composite combination.

For example,

CREATE TABLE distributors (m_id integer, x_id integer, PRIMARY KEY(m_id, x_id));

enforces NOT NULL constraint on columns m_id and x_id, which I don't want! MySQL doesn't do this. I think Oracle doesn't do it as well.

I understand that PRIMARY KEY enforces UNIQUE and NOT NULL automatically but that makes sense for single-column primary key. In a multi-column primary key table, the uniqueness is determined by the combination.

Is there any simple way of avoiding this behavior of Postgres?
If I execute this:

CREATE TABLE distributors (m_id integer, x_id integer);

I do not get any NOT NULL constraints of course.

Answer

Erwin Brandstetter picture Erwin Brandstetter · Oct 30, 2014

If you need to allow NULL values, use a UNIQUE constraint instead of a PRIMARY KEY (and add a surrogate PK column, I suggest a serial). This allows columns to be NULL:

CREATE TABLE distributor (
   distributor_id serial PRIMARY KEY
 , m_id integer
 , x_id integer
 , UNIQUE(m_id, x_id)
);

Note, however (per documentation):

For the purpose of a unique constraint, null values are not considered equal.

In your case, you could enter (1, NULL) for (m_id, x_id) any number of times without violating the constraint. Postgres never considers two NULL values equal - as per definition in the SQL standard.

If you need to treat NULL values as equal to disallow such "duplicates", I see two options:

1. Two partial indexes

In addition to the UNIQUE constraint above:

CREATE UNIQUE INDEX dist_m_uni_idx ON distributor (m_id) WHERE x_id IS NULL;
CREATE UNIQUE INDEX dist_x_uni_idx ON distributor (x_id) WHERE m_id IS NULL;

But this gets out of hands quickly with more than two columns that can be NULL.
Related:

2. A multi-column UNIQUE index on expressions

Instead of the UNIQUE constraint. We need a free default value that is never present in involved columns, like -1. Add CHECK constraints to disallow it:

CREATE TABLE distributor (
   distributor serial PRIMARY KEY
 , m_id integer
 , x_id integer
 , CHECK (m_id <> -1)
 , CHECK (x_id <> -1)
);
CREATE UNIQUE INDEX distributor_uni_idx ON distributor (COALESCE(m_id, -1)
                                                      , COALESCE(x_id, -1))

How certain RDBMS handle things isn't always a useful indicator for proper behavior. The Postgres manual hints at this:

That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior conforms to the SQL standard, but we have heard that other SQL databases might not follow this rule. So be careful when developing applications that are intended to be portable.

Bold emphasis mine.