How to create a database with UTF-8 collation in PostgreSQL on Windows?

Wojtek Wencel picture Wojtek Wencel · Jul 15, 2018 · Viewed 16k times · Source

I'm configuring PostgreSQL db for the Bitbucket Server on Windows. In the official guide it says that:

The database must be configured to use the UTF-8 character set.

It doesn't strictly say that you have to set collation to UTF-8, but for other atlassian procucts it's recommended so I assume that's the same case for Bitbucket Server. Exmaple from Confluence documentation:

  • Character encoding must be set to utf8 encoding.
  • Collation must also be set to utf8. Other collations, such as "C", are known to cause issues with Confluence.

This is what I have now, the problem is that it sets the collation to English_United States.1252:

CREATE DATABASE test
WITH OWNER "postgres"
ENCODING 'UTF8'
LC_COLLATE = 'american_usa'
LC_CTYPE = 'american_usa'
TEMPLATE template0;

Is setting collation to UTF-8 actually necessary and if yes, how can I do it?

Answer

benvc picture benvc · Aug 29, 2018

Assuming that you are trying to create a PosgreSQL database with US locale sort order and character classification with UTF-8 encoding on Windows, following is a modification to the code example posted in the original question that may be used to achieve that result.

CREATE DATABASE "example_db"
WITH OWNER "postgres"
ENCODING 'UTF8'
LC_COLLATE = 'en-US'
LC_CTYPE = 'en-US'
TEMPLATE template0;

One liner format for terminal copy / paste:

CREATE DATABASE "example_db" WITH OWNER "postgres" ENCODING 'UTF8' LC_COLLATE = 'en-US' LC_CTYPE = 'en-US' TEMPLATE template0;

For anyone trying to create a similar database in a Linux environment such as Ubuntu on Windows Subsystem for Linux, you can do the following (depending on the specific environment, you may need to use 'en_US.UTF8' as the locale instead):

CREATE DATABASE "example_db"
WITH OWNER "postgres"
ENCODING 'UTF8'
LC_COLLATE = 'en_US.UTF-8'
LC_CTYPE = 'en_US.UTF-8'
TEMPLATE template0;

One liner format for terminal copy / paste:

CREATE DATABASE "example_db" WITH OWNER "postgres" ENCODING 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' TEMPLATE template0;