Configuring Locales on Linux for PostgreSQL

The_Denominater picture The_Denominater · Feb 8, 2011 · Viewed 15.4k times · Source

I'm having trouble getting a particular database set up and running. I'm trying to restore a postgreSQL dump I got from somebody else. I've tried a few methods to no avail.

  1. Straight from pg_restore

    pg_restore -C -d postgres --exit-on-error maggie_prod_20111221.dump.sql

  2. Creating the database and tablespace first

    createdb -T template0 maggieprod -E LATIN1

    SQL: CREATE TABLESPACE magdat OWNER maggie LOCATION '/somewhere/magdat';

    pg_restore -v -d template1 maggie_prod_20110121.dump.sql

Using the first method I get the following:

pg_restore: [archiver (db)] Error while PROCESSING TOC:

pg_restore: [archiver (db)] Error from TOC entry 2308; 1262 16386 DATABASE maggieprod postgres

pg_restore: [archiver (db)] could not execute query: ERROR: encoding LATIN1 does not match locale en_CA.utf8

DETAIL: The chosen LC_CTYPE setting requires encoding UTF8. Command was: CREATE DATABASE maggieprod WITH TEMPLATE = template0 ENCODING = 'LATIN1' TABLESPACE = magdat;

And using the second, when I try and create the database I get:

createdb: database creation failed: ERROR: encoding LATIN1 does not match locale en_CA.utf8

DETAIL: The chosen LC_CTYPE setting requires encoding UTF8.

So it seems to be that I cannot create a LATIN1 encoding database? Why is that? I am new to locales and encoding and don't know very much about them. I just know that the dump was made off of a LATIN1 database.

The output of locale is:

LANG=en_CA.utf8 LC_CTYPE="en_CA.utf8" LC_NUMERIC="en_CA.utf8" LC_TIME="en_CA.utf8" LC_COLLATE="en_CA.utf8" LC_MONETARY="en_CA.utf8" LC_MESSAGES="en_CA.utf8" LC_PAPER="en_CA.utf8" LC_NAME="en_CA.utf8" LC_ADDRESS="en_CA.utf8" LC_TELEPHONE="en_CA.utf8" LC_MEASUREMENT="en_CA.utf8" LC_IDENTIFICATION="en_CA.utf8" LC_ALL=

And the output of locale -a is:

C en_AG en_AG.utf8 en_AU.utf8 en_BW.utf8 en_CA.utf8 en_DK.utf8 en_GB.utf8 en_HK.utf8 en_IE.utf8 en_IN en_IN.utf8 en_NG en_NG.utf8 en_NZ.utf8 en_PH.utf8 en_SG.utf8 en_US.utf8 en_ZA.utf8 en_ZW.utf8 POSIX

I don't see LATIN1 in the second command, should I? If so, how would I go about adding it? Is it correct for me to assume that I need to change the locale on my computer? If so, is there a way to do that only for postgreSQL? Also, when I try and open the dump I see a lot of garbage characters, I am assuming this is because of the encoding, how would I look at it properly?

Thanks for any help.

Answer

Peter Eisentraut picture Peter Eisentraut · Feb 8, 2011

You need to create the database with a locale that matches the encoding, e.g.,

createdb -T template0 maggieprod -E LATIN1 --locale=en_CA

Since you don't have all locales installed, I guess you are using Debian or Ubuntu. In that case, call dpkg-reconfigure locales or install the locales-all package.

Alternatively, create the database with encoding UTF8. As long as all your clients set the client encoding correctly, it shouldn't make a difference.