user4695271
user4695271

Reputation:

Setting UTF-8 in MariaDB server ignores character_set_system

I have the following Dockerfile definition file for a MariaDB server:

version: "3.8"

...

services:
  database:
    command: ["mysqld", "--character-set-server=utf8mb4", "--collation-server=utf8mb4_unicode_ci"]
    container_name: mariadb
    environment:
      MARIADB_DATABASE: sample_database
      MARIADB_INITDB_SKIP_TZINFO: "true"
      MARIADB_PASSWORD_FILE: /run/secrets/mariadb_user_password
      MARIADB_ROOT_PASSWORD_FILE: /run/secrets/root_user_password
      MARIADB_USER: mariadb
      TZ: "Etc/UTC" # https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
    image: docker.io/library/mariadb:10.6-focal # https://hub.docker.com/_/mariadb/
    networks:
      - global-network
    ports:
      - "3306:3306"
    restart: on-failure
    secrets:
      - mariadb_user_password
      - root_user_password
    stdin_open: true
    tty: true
    volumes:
      - "database-volume:/var/lib/mysql"

I'm not quite versed on this topic, but if I would like to check the character- and collation-set used by the server, I usually do execute these queries:

> show variables like 'char%';
+------------------------+--------------------------+
|Variable_name           |Value                     |
+------------------------+--------------------------+
|character_set_client    |utf8mb4                   |
|character_set_connection|utf8mb4                   |
|character_set_database  |utf8mb4                   |
|character_set_filesystem|binary                    |
|character_set_results   |utf8mb4                   |
|character_set_server    |utf8mb4                   |
|character_set_system    |utf8mb3                   |
|character_sets_dir      |/usr/share/mysql/charsets/|
+------------------------+--------------------------+


> show variables like 'collation%';
+--------------------+------------------+
|Variable_name       |Value             |
+--------------------+------------------+
|collation_connection|utf8mb4_unicode_ci|
|collation_database  |utf8mb4_unicode_ci|
|collation_server    |utf8mb4_unicode_ci|
+--------------------+------------------+

My understanding is that the settings that I have: --character-set-server=utf8mb4 and --collation-server=utf8mb4_unicode_ci are the correct ones to set both, the character set and collation to UTF-8.

Now my questions is: how do I get rid of that utf8mb3 value for character_set_system? It should be set to utf8mb4 along with the other values — or that's what I think ;)

Any clues how to set this correctly?

I would like to avoid using any configuration file(s) (like my.cnf) since I'm using the stock Docker image without any modifications.

Upvotes: 3

Views: 8059

Answers (2)

Georg Richter
Georg Richter

Reputation: 7526

character_set_system specifies the character set which will be used to store identifiers and other internal information.

It is 3byte utf8, and you cannot change it unless you change sources and recompile MariaDB.

Beginning with 10.6, utf8 was mapped to utf8mb3 (and will be mapped to utf8mb4 in later versions). See MDEV-8334

Upvotes: 6

Cristian Romanescu
Cristian Romanescu

Reputation: 684

I am having a similar problem in MariaDB 10.6.5, I was trying to load a dump from AWS and had this error:

ERROR 1253 (42000) at line 26: COLLATION 'utf8mb3_general_ci' is not valid for CHARACTER SET 'utf8mb4'

My config looks like this:

MariaDB [(none)]> SHOW VARIABLES LIKE '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8mb3                    |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

and according to this page https://mariadb.com/kb/en/old-mode/ I had to set old-mode to empty in /etc/mysql/mariadb.conf.d/50-server.cnf

old-mode=

Therefore it changed from

MariaDB [(none)]> SHOW VARIABLES LIKE '%old%';
+------------------------------------------+-----------------+
| Variable_name                            | Value           |
+------------------------------------------+-----------------+
| old_mode                                 | UTF8_IS_UTF8MB3 |
+--------------------------+---------------------------------+

to

MariaDB [(none)]> SHOW VARIABLES LIKE '%old%';
+------------------------------------------+-----------------+
| Variable_name                            | Value           |
+------------------------------------------+-----------------+
| old_mode                                 |                 |
+--------------------------+---------------------------------+

So I've managed to load the SQL dump even if character_set_system was still showing utf8mb3.

HTH somebody.

Upvotes: 6

Related Questions