mr12086
mr12086

Reputation: 1147

latin-1 to utf-8 database

I have a database that is uft8_general_ci, only problem is up until this point an application coded by a previous developer seems to have been working with the database in latin-1.

I have since changed how the app uses the database and can now store an umlaut as an umlaut instead of ü. The problem now is that the application reads the previously existing data from the database as (example) 'Süddeutsche' instead of 'Süddeutsche'.

Is there anyway to convert the data inside the database from one format to the other?

Regards

Edit:

ALTER TABLE TableName MODIFY ColumnName ColumnType CHARACTER SET latin1;
ALTER TABLE TableName MODIFY ColumnName ColumnType CHARACTER SET binary;
ALTER TABLE TableName MODIFY ColumnName ColumnType CHARACTER SET utf8;

This worked for me.

Upvotes: 10

Views: 4370

Answers (4)

Said Torres
Said Torres

Reputation: 653

Here is a SQL function that I used in my db to cast all data codification from latin1 to UTF8:

DELIMITER //

CREATE PROCEDURE UpdateColumnEncoding(IN db_name VARCHAR(255))
BEGIN
    DECLARE done INT DEFAULT FALSE;
    DECLARE tableName VARCHAR(255);
    DECLARE columnName VARCHAR(255);
    DECLARE cur CURSOR FOR 
        SELECT TABLE_NAME, COLUMN_NAME
        FROM information_schema.COLUMNS
        WHERE TABLE_SCHEMA = db_name 
          AND DATA_TYPE IN ('char', 'varchar', 'text', 'tinytext', 'mediumtext', 'longtext');
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

    OPEN cur;

    read_loop: LOOP
        FETCH cur INTO tableName, columnName;
        IF done THEN
            LEAVE read_loop;
        END IF;

        SET @s = CONCAT('UPDATE `', tableName, '` SET `', columnName, '` = CONVERT(CAST(CONVERT(`', columnName, '` USING latin1) AS BINARY) USING UTF8)');
        PREPARE stmt FROM @s;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
    END LOOP;

    CLOSE cur;
END //

DELIMITER ;

CALL UpdateColumnEncoding('your_database_name');

Upvotes: 1

mr12086
mr12086

Reputation: 1147

http://blog.hno3.org/2010/04/22/fixing-double-encoded-utf-8-data-in-mysql/

Using what is listed here has fixed all my problems. I used this with my live data and have had no issues!

Upvotes: 1

silly
silly

Reputation: 7887

try

ALTER DATABASE your_db DEFAULT CHARACTER SET = 'utf8' COLLATE 'utf8_unicode_ci';

and

ALTER TABLE a CONVERT TO CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci';

ALTER TABLE b CONVERT TO CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci';
ALTER TABLE c CONVERT TO CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci';

don't forget to replace the 'ß':

UPDATE a SET field_1 = REPLACE(field_1, 'ß', 'ss') WHERE label LIKE '%ß%';

Upvotes: 1

bardiir
bardiir

Reputation: 14782

You could try SET NAMES to let the Database talk in latin-1 with your application while storing in utf-8 or you will need to convert all previous Datasets to utf-8-Strings

Upvotes: 2

Related Questions