toraritte

Reputation: 8343

PostgreSQL's `initdb` fails with "invalid locale settings; check LANG and LC_* environment variables"

Already found a solution for this (see answer below), but I am not sure that it is the appropriate one; plus this may help someone else too.


Tried to set up PostgreSQL by following the documentation (18.2 Creating a Database Cluster), but got the following error on Ubuntu 18.04 (kernel: 4.15.0-22-generic):

$ initdb -D /usr/local/pgsql/data
  (...)
initdb: invalid locale settings; check LANG and LC_* environment variables 

Found a couple of relevant answers on Stack Overflow (1, 2), but they did not resolve the issue. The one on Server Fault suggested restarting the service, but PostgreSQL wasn't even running.

Tried passing the locale explicitly in every variation that I found on the system, but these failed too,

$ initdb -D ~/Downloads/ --locale=en_US.utf8
$ initdb -D ~/Downloads/ --locale=en_US.UTF8
$ initdb -D ~/Downloads/ --locale=en_US.UTF-8
$ initdb -D ~/Downloads/ --locale="en_US.UTF-8"
$ initdb -D ~/Downloads/ --locale="en_US.utf8"

with

initdb: invalid locale name <the_option_value_above>

There was an Arch Linux forum discussion about this, but there was no solution.


2018/06/07 1214 UPDATE

I linked answers above, but perhaps wasn't explicit enough: I did look at locale -a and locale (not listing the former's output because I installed ALL of them in my attempts below):

$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

What has been tried but did not work (the terminal was restarted for every iteration):

TODO: https://unix.stackexchange.com/questions/294845/bash-warning-setlocale-lc-all-cannot-change-locale-en-us-utf-8

Upvotes: 12

Views: 33744

Answers (5)

obolenskaya00

Reputation: 39

For me this came up when I tried to upgrade Postgres from 9.6 to 15. The --locale option should be replaced with --locale-provider and --icu-locale, e.g.:

POSTGRES_INITDB_ARGS="--locale=nl_NL --encoding=UTF8"

->

POSTGRES_INITDB_ARGS="--locale-provider=icu --icu-locale=nl_NL --encoding=UTF8"

Upvotes: 0

RichardW

Reputation: 969

Although the question does not mention Nix, the original poster linked to this issue from the Nix discourse site so I believe this is a Nix-related issue.

I ran into this issue when running under Nix shell and found the solution here after much searching. I just had to add glibcLocales to my environment. I.e. either run nix-shell -p glibcLocales or add glibcLocales to buildInputs.

Upvotes: 5

awagner

Reputation: 68

  1. Check if the locale is enabled in /etc/locale.gen. On my fresh install of Arch Linux ARM the following line was commented out:
    en_US.UTF-8 UTF-8
    
  2. Run locale-gen without any arguments. It will list all the uncommented locales as it generates them.
  3. Optional: edit /etc/locale.conf to set the system locale:
    echo "LANG=en_US.UTF-8" > /etc/locale.conf
    
    Restart the system to make all services pick up the new setting.
  4. Run your initdb command.
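
Steps 1 and 2 can be scripted. A minimal sketch, assuming GNU sed (for -i) and the usual locale.gen layout where disabled entries start with #; enable_locale is a hypothetical helper name, not something shipped by any package:

```shell
# Uncomment one entry (e.g. "en_US.UTF-8 UTF-8") in a locale.gen-style file.
# enable_locale is a hypothetical helper; the file defaults to /etc/locale.gen.
enable_locale() {
    entry=$1
    file=${2:-/etc/locale.gen}
    # Escape dots so they match literally in the sed pattern.
    pattern=$(printf '%s' "$entry" | sed 's/\./\\./g')
    # Drop the leading "#" (and any spaces after it) from the matching line only.
    sed -i "s/^#[[:space:]]*\(${pattern}\)[[:space:]]*\$/\1/" "$file"
}
```

Run as root, this would be used as enable_locale "en_US.UTF-8 UTF-8" followed by locale-gen. Taking the file as an optional second argument makes it easy to try on a copy first.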

Upvotes: 2

Laurenz Albe

Reputation: 248215

You can get a listing of the locales available in Linux with

locale -a

Use one of these.

You have to choose a locale that matches your encoding, for example

initdb -E UTF8 --locale=en_US.utf8

or

initdb -E LATIN9 --locale=et_EE.iso885915

As far as I know, you can install additional locales with

sudo apt-get install language-pack-XX

Upvotes: 6
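
A spelling mismatch is easy to miss here: glibc's locale -a typically prints en_US.utf8 even when the environment says en_US.UTF-8. Below is a rough sketch for checking a name against that list before calling initdb. normalize_locale and locale_in_list are hypothetical helper names, and the normalization rule (lowercase the codeset, drop hyphens) is an assumption based on how glibc usually prints locale names:

```shell
# Normalize a locale name to the spelling glibc's `locale -a` usually prints,
# e.g. en_US.UTF-8 -> en_US.utf8 (assumption: codeset lowercased, hyphens dropped).
normalize_locale() {
    name=$1
    case $name in
        *.*)
            lang=${name%%.*}        # part before the first dot, e.g. en_US
            codeset=${name#*.}      # part after the first dot, e.g. UTF-8
            codeset=$(printf '%s' "$codeset" | tr '[:upper:]' '[:lower:]' | tr -d '-')
            printf '%s.%s\n' "$lang" "$codeset"
            ;;
        *)
            # No codeset (e.g. C or POSIX): leave the name untouched.
            printf '%s\n' "$name"
            ;;
    esac
}

# Succeed if the normalized name appears as a whole line in the list on stdin.
locale_in_list() {
    wanted=$(normalize_locale "$1")
    grep -qxF -- "$wanted"
}
```

For example, locale -a | locale_in_list en_US.UTF-8 succeeds only if the locale is actually generated on the system, which is worth confirming before passing it to initdb.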

toraritte

Reputation: 8343

From this thread:

initdb -D <your_data_location> --no-locale --encoding=UTF8

where

  --locale=LOCALE       set default locale for new databases
  --no-locale           equivalent to --locale=C

There are caveats (see warning below), but an all-utf8 database can be created using template0 (see 21.3. Template Databases).

From the client (psql):

postgres=# create database test LC_COLLATE "en_US.UTF-8" LC_CTYPE "en_US.UTF-8" template template0;

Or via createdb:

createdb --lc-collate="en_US.UTF-8" --lc-ctype="en_US.UTF-8" --template="template0" test2

Check:

$ psql
psql (10.3)
Type "help" for help.

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | C           | C           | 
 template0 | postgres | UTF8     | C           | C           | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | C           | C           | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 test      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 test2     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 

WARNING: This is probably not the correct solution and the workaround above is just that, a workaround.

Note the "Collate" and "Ctype" fields below in a cluster created with the above solution: both are C, and this may cause issues, because "The results of comparisons between strings depends on LC_CTYPE. In practice, the most visible effect is the sort order." (see DBA StackExchange thread). This is also confirmed on the PostgreSQL mailing list (see this thread about this issue on a database in production). Probably the easiest way to solve this would be to re-initialize the cluster or recreate the database.

postgres=# \l
                             List of databases
   Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges   
-----------+----------+----------+---------+-------+-----------------------
 postgres  | postgres | UTF8     | C       | C     | 
 template0 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
 template1 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
(3 rows)
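
To spot databases that ended up with the C collation without reading the \l table by eye, something like the following may help. It is only a sketch: list_c_collated is a hypothetical helper, and it assumes input in psql's unaligned, tuples-only format (name|collate|ctype), for example from psql -At -c "SELECT datname, datcollate, datctype FROM pg_database".

```shell
# Read "name|collate|ctype" lines on stdin and print the names of databases
# whose collation is C or POSIX. list_c_collated is a hypothetical helper.
list_c_collated() {
    awk -F'|' '$2 == "C" || $2 == "POSIX" { print $1 }'
}
```

Any database it prints sorts by byte value rather than by language rules, which is the caveat described above.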

Upvotes: 7
