Dockerized PostgreSQL: psql: FATAL: the database system is starting up

mdh · Feb 22, 2017 · Viewed 16.4k times

I am trying to build and run two Docker containers hosting PostgreSQL with the Citus extension using ansible-container. I am aware that Citus provides containers, but I want to build my own.

My container.yml looks as follows:

version: '2'

services:

  database_master:
    image: hackermd/ubuntu-trusty-python
    user: postgres
    expose:
      - 5043
    entrypoint: ['dumb-init', '--']
    command: ['/usr/bin/pg_ctlcluster', '9.6', 'master', 'start']
    links:
      - database_worker
    depends_on:
      - database_worker

  database_worker:
    image: hackermd/ubuntu-trusty-python
    user: postgres
    expose:
      - 9700
    entrypoint: ['dumb-init', '--']
    command: ['/usr/bin/pg_ctlcluster', '9.6', 'worker', 'start']

During the build process, I can start and stop the cluster via pg_ctlcluster, and the build finishes successfully. However, when I subsequently run the containers, I get the following error:

$ docker logs ansible_database_master_1
Removed stale pid file.
Warning: connection to the database failed, disabling startup checks:
psql: FATAL:  the database system is starting up

When I build containers with command: [] and run ps aux inside the container, I see the following process:

postgres    14  1.6  0.1 307504  3480 ?        Ds   16:46   0:00 postgres: 9.6/master: startup process

I've also tried without the dumb-init entrypoint. What am I missing?

Answer

mdh · Feb 23, 2017

The problem is related to the default shutdown mode of pg_ctl stop (pg_ctlcluster calls pg_ctl under the hood). Stopping the cluster during the build process via pg_ctlcluster with the pg_ctl option -m smart solves the problem:

pg_ctlcluster 9.6 master stop -- -m smart

In contrast to the default "fast" mode, the "smart" mode waits for active clients to disconnect and any online backup to finish before shutting down. This is explained in the pg_ctl documentation.
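
As a minimal sketch, the build step could wrap this call in a small shell function so that the master and the worker cluster are stopped the same way. The function name stop_cluster_smart and the PG_CTLCLUSTER override variable are hypothetical and exist only for illustration; the command it runs is the one shown above.

```shell
# Hypothetical override so the command can be swapped out (e.g. in a dry run).
# Defaults to the real pg_ctlcluster from postgresql-common.
PG_CTLCLUSTER="${PG_CTLCLUSTER:-pg_ctlcluster}"

# stop_cluster_smart VERSION NAME
# Stops the given cluster and passes "-m smart" through to pg_ctl,
# i.e. everything after "--" is handed to pg_ctl by pg_ctlcluster.
stop_cluster_smart() {
    version="$1"
    name="$2"
    "$PG_CTLCLUSTER" "$version" "$name" stop -- -m smart
}
```

During the build this would be called as stop_cluster_smart 9.6 master and stop_cluster_smart 9.6 worker, after all setup statements have been run.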

In addition, the container would exit as soon as the pg_ctlcluster process had successfully started the database cluster via postgres (pg_ctlcluster -> pg_ctl -> postgres), because the command returns once the server is running in the background. To prevent this, postgres can be called directly so that it stays in the foreground. The container.yml file would then look as follows:

version: '2'

services:

  database_master:
    image: hackermd/ubuntu-trusty-python
    user: postgres
    expose:
      - 5043
    command: ['dumb-init', '/usr/lib/postgresql/9.6/bin/postgres', '-D', '/var/lib/postgresql/9.6/master']
    links:
      - database_worker
    depends_on:
      - database_worker

  database_worker:
    image: hackermd/ubuntu-trusty-python
    user: postgres
    expose:
      - 9700
    command: ['dumb-init', '/usr/lib/postgresql/9.6/bin/postgres', '-D', '/var/lib/postgresql/9.6/worker']
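
If the long command lists become unwieldy, the same foreground invocation could be moved into a small entrypoint helper. This is only a sketch under the assumption that the binary and data directories follow the paths used above; the function name run_postgres_foreground and the POSTGRES_BIN override variable are hypothetical.

```shell
# run_postgres_foreground VERSION NAME
# Replaces the current shell with the postgres server process, so the
# server stays in the foreground and the container keeps running for
# as long as the server does.
run_postgres_foreground() {
    version="$1"
    name="$2"
    # Hypothetical override; defaults to the binary path used in container.yml.
    bin="${POSTGRES_BIN:-/usr/lib/postgresql/$version/bin/postgres}"
    # exec hands the process over to postgres instead of forking a child,
    # so signals forwarded by dumb-init reach the server directly.
    exec "$bin" -D "/var/lib/postgresql/$version/$name"
}
```

A script that ends with run_postgres_foreground 9.6 master could then be used as the single argument after dumb-init in the command entry.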