Amit S
Amit S

Reputation: 3

Incorrect string value: '\\xA0Consu...' for column even though column has utf8mb4 encoding

I am running python script to insert record in table. I am reading data from spreadsheet that contain cell with special characters.

In my script I am running following commands before executing below commands:

SET NAMES utf8mb4;
SET CHARACTER SET utf8mb4;
SET character_set_connection=utf8mb4;
SET collation_connection = utf8mb4_unicode_ci;

Here is my table structure:

CREATE TABLE `staging` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`d_date` date NOT NULL,
`m_date` date NOT NULL,
`market` varchar(10) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'US',
`cola` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
`colb` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=265 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Table Encoding: UTF8MB4 Table Collation: UTF8MB4_UNICODE_CI

Environment Variables:

SHOW VARIABLES LIKE '%collation%';

collation_connection    latin1_swedish_ci
collation_database  utf8mb4_unicode_ci
collation_server    utf8mb4_unicode_ci

SHOW VARIABLES LIKE '%char%';

character_set_client    latin1
character_set_connection    latin1
character_set_database  utf8mb4
character_set_filesystem    binary
character_set_results   latin1
character_set_server    utf8mb4
character_set_system    utf8
character_sets_dir  /usr/local/mysql-5.7.20-macos10.12-x86_64/share/charsets/

Query:

insert into staging (d_date, m_date, market, cola, colb ) 
values ('2019-07-18', '2019-07-01', 'US', 'ARCA', 'Sodius\xa0Consumer Auto');

Error:

_mysql_exceptions.OperationalError: (1366, "Incorrect string value: '\\xA0Consu...' for column 'colb' at row 1")

Expected Result is row should be inserted into the database.

Upvotes: 0

Views: 3083

Answers (1)

Rick James
Rick James

Reputation: 142278

This says that the encoding in the client is UTF-8: SET NAMES utf8mb4;

This says that the encoding in the client is latin1: \xa0.

This says that cola will be encoded UTF-8, regardless of what the client has: cola varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL.

These say that you did SET NAMES latin1:

character_set_client    latin1
character_set_connection    latin1
character_set_results   latin1

If you really want the \xA0 (hard space), then the entire text in the client must also be latin1, and you must say SET NAMES latin1 (or ... -- there are several ways to do this).

If you don't care what type of space you use, then use a simple space.

If the rest of the client is talking UTF-8, and you want a "non-breaking space", then USE \xc2a0 (the UTF-8 encoding).

Upvotes: 1

Related Questions