Reputation: 11082
I have a MySQL Database (myDB
; ~2GB in size) with 4 Tables (tab1
, tab2
, tab3
, tab4
). Currently, the data that is stored in the tables was added using the charset ISO-8859-1
(i.e. Latin-1
).
I'd like to convert the data in all tables to UTF-8
and use UTF-8
as default charset of the tables/database/columns.
On https://blogs.harvard.edu/djcp/2010/01/convert-mysql-database-from-latin1-to-utf8-the-right-way/ I found an interesting approach:
mysqldump myDB | sed -i 's/CHARSET=latin1/CHARSET=utf8/g' | iconv -f latin1 -t utf8 | mysql myDB2
I haven't tried it yet, but are there any caveats?
Is there a way to do it directly in the MySQL shell?
[EDIT:]
Result of SHOW CREATE TABLE messages;
after running ALTER TABLE messages CONVERT TO CHARACTER SET utf8mb4;
CREATE TABLE `messages` (
`number` int(11) NOT NULL AUTO_INCREMENT,
`status` enum('0','1','2') NOT NULL DEFAULT '1',
`user` varchar(30) NOT NULL DEFAULT '',
`comment` varchar(250) NOT NULL DEFAULT '',
`text` mediumtext NOT NULL,
`date` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`number`),
KEY `index_user_status_date` (`user`,`status`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=3285217 DEFAULT CHARSET=utf8mb4
Upvotes: 2
Views: 4782
Reputation: 142298
It is possible to convert the tables. But then you need to convert the application, too.
ALTER TABLE tab1 CONVERT TO utf8mb4;
etc.
To check, do SHOW CREATE TABLE tab1
; it should show you CHARACTER SET utf8mb4
.
Note: There are 3 things going on:
VARCHAR
and TEXT
columns.CHARACTER SET
for such columns.DEFAULT CHARACTER SET
for the table -- this comes into play if you add any new columns without specifying a charset.The application...
When you connect from a client to MySQL, you need to tell it, in a app-specific way or via SET NAMES
, the encoding of the bytes in the client. This does not have to be the same as the column declarations; conversion will occur during INSERT
and SELECT
, if necessary.
I recommend you take a backup and/or test a copy of one of the tables. Be sure to go all the way through -- insert, select, display, etc.
Upvotes: 3