tracyfan_1
tracyfan_1

Reputation: 15

SQL converting column from Latin1 to UTF8

I am trying to convert a specific column in a table on my DB from latin1 character set with collation latin1_swedish_ci to utf8 with collation utf8_unicode_ci.

COLUMN: description, type: longtext, default not null

I tried the following commands on the column:

ALTER TABLE sample MODIFY description LONGBLOB NOT NULL ;

ALTER TABLE sample MODIFY description LONGTEXT CHARACTER SET utf8 NOT NULL COLLATE utf8_unicode_ci;

I also tried to alter the encoding WITHOUT changing to binary first. But the characters ended up being re-encoded incorrectly by the server.

And keep getting an error regarding some characters:

Error Code: 1366. Incorrect string value: '\x92t hav...' for column 'longDesc' at row 803

It seems like some of the character in my table aren't converting correctly.

How can I fix this issue?

Upvotes: 1

Views: 5864

Answers (1)

Rick James
Rick James

Reputation: 142208

\x92 implies that you have latin1 in the table now. The second ALTER is claiming that the bytes are in utf8 encoding. Hence, the error message.

Case 1: You need to change the LONGTEXT to utf8 because you plan to add rows with text that cannot be encoded in latin1.

For this case, ALTER TABLE sample CONVERT TO CHARACTER SET utf8; -- converts all CHAR/TEXT columns in the table.

ALTER TABLE sample MODIFY description ... CHARACTER SET utf8; -- converts the one column.

Case 2: The rest of the system is thinking utf8 and is confused by this column.

Well, I don't think it is confused. Conversions happen as needed.

Upvotes: 1

Related Questions