slevin

Reputation: 3896

Install utf8 collation in PostgreSQL

Right now I can choose Encoding: UTF8 when creating a new DB in the pgAdmin 4 GUI.

But there is no option to choose utf8_general_ci as the collation or character type. When I run select * from pg_collation; I don't see any collation related to utf8_general_ci.

Coming from a MySQL background, I am confused. Do I have to install a utf8-like collation (e.g. utf8_general_ci, utf8_unicode_ci) in PostgreSQL 10 or in Windows 10?

I just want the PostgreSQL equivalent of the MySQL collation utf8_general_ci.

Thank you

Upvotes: 11

Views: 13510

Answers (1)

Tometzky

Reputation: 23920

utf8 is an encoding (how Unicode characters are represented as a series of bytes), not a collation (which character sorts before which).
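To see the two settings side by side on a running server, a query along these lines should work. It is only a sketch that reads the pg_database system catalog; the output depends entirely on how your databases were created.

```sql
-- Encoding and collation are separate per-database settings.
-- pg_encoding_to_char() turns the numeric encoding id into a name such as UTF8.
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate,   -- string sort order (LC_COLLATE)
       datctype      -- character classification (LC_CTYPE)
FROM pg_database;
```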

I think the closest Postgres 10 equivalent of utf8_general_ci (or the more modern utf8_unicode_ci) is the collation called und-x-icu. It is an "undefined" collation (not tied to any real-world language) provided by the ICU library, and it sorts characters from most languages quite reasonably.
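A minimal sketch of how und-x-icu could be used, assuming a build with ICU support. The table people and its column name are hypothetical, made up for illustration. As far as I know, PostgreSQL 10 does not accept an ICU collation as a database-wide default, which would explain why it is not offered when creating a database; it is applied per column or per expression instead.

```sql
-- Hypothetical table: attach the ICU collation to a column...
CREATE TABLE people (
    name text COLLATE "und-x-icu"
);

-- ...or apply it to a single expression in a query.
SELECT name
FROM people
ORDER BY name COLLATE "und-x-icu";
```

Note that this only affects sort order. In PostgreSQL 10, equality comparisons stay case-sensitive under this collation, so the "_ci" part of the MySQL behaviour would still need something like lower(name) in the comparison.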

ICU support is a new feature in PostgreSQL 10, so this collation isn't available in older PostgreSQL versions, or when ICU support was disabled at compile time. Before that, Postgres relied solely on collation support provided by the operating system, which differs between operating systems.
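To check whether your installation actually has ICU collations, you could filter pg_collation by provider. This sketch relies on the collprovider column, which was added in PostgreSQL 10, where 'i' marks ICU collations.

```sql
-- List the collations provided by ICU; und-x-icu should be among them.
SELECT collname, collprovider
FROM pg_collation
WHERE collprovider = 'i'
ORDER BY collname;
```

If this returns no rows, the server was most likely built without ICU support.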

Upvotes: 12
