James
James

Reputation: 492

Way to replace column values with new foreign key?

I have a large table (millions of rows) and I'd like to replace all the values in a VARCHAR column with an INT, which would be a foreign key pointing to a new table holding all the VARCHAR values. The VARCHAR values are unique, if that's important.

At the moment, I can efficiently create the foreign table with this:

INSERT INTO `messages` (`message`) SELECT `message` FROM `history`;

But my first basic attempt to replace the VARCHAR column is horribly inefficient and is taking seconds per row, which is obviously not practicable (the LIMIT is there for testing).

$messages = $mysqli->query('SELECT `message` FROM `history` LIMIT 0,100;');
while($row = $messages->fetch_array(MYSQLI_ASSOC)) {
    $msg = $row['message'];
    $result2 = $mysqli->query(
        'SELECT `id` FROM `messages` WHERE `message` = "'.$msg.'" LIMIT 0,1'
    );
    $row2 = $result2->fetch_array(MYSQLI_ASSOC);
    $id = $row2['id'];
    $mysqli->query(
        'UPDATE `history` SET `message`="'.$id.'" WHERE `message` = "'.$msg.'"'
    );
    $result2->free_result();
}

It feels like I should be able to do this process entirely in SQL, rather than relying on PHP, which would hopefully also be the efficient way.

Upvotes: 1

Views: 1447

Answers (2)

Bravado
Bravado

Reputation: 137

do you have to use php? because you're making: 1) one request to bring 100 rows 2) for each row, one request to bring the id of the other table and then another request to update the original table. And I assume that the original table is bigger than 100 rows.

Can't you do it directly inside the DB? if the varchar values are unique and what you want is just have them in a separate table, you can just populate the new table just as you did but specifying also the id, which will conveniently be the very same id of the original table:

INSERT INTO `messages` (`id`, `message`) SELECT `id`, `message` FROM `history`;

Then, create a new INT column that will hold the foreign key:

ALTER TABLE history ADD message_id INT;

Then, populate the column:

UPDATE history SET message_id = id;

Then remove the varchar column:

ALTER TABLE history DROP message;

You can now add the constraint to indicate that message_id is a foreign key:

ALTER TABLE history ADD CONSTRAINT fk_message_id FOREIGN KEY message_id REFERENCES messages(id);

This way you do only 2 iterations: one to populate the new table, and one to set the new column.

Upvotes: 1

GMB
GMB

Reputation: 222482

Instead of looping with PHP, you can use an UPDATE ... JOIN query to let your RDBMS do the heavy lifting at once:

UPDATE history h
INNER JOIN messages m ON m.message = h.message
SET h.message = m.id

This assumes that table messages is already fed. For better performance, since you stated that values are unique in column message, you might want to create a UNIQUE constraint on messages(messsage).

Upvotes: 4

Related Questions