Reputation: 8283
I am normalizing a MySQL database. I designed a new structure.
How do I move the database data from the old structure to the new structure using SQL?
Upvotes: 5
Views: 11919
Reputation: 745
I have recently done this and have some insights to how the general procedure can be executed.
Start by modelling your data. When you start with a database that is not normalized you need to create a proper model which you want to transfer your data to. This includes identifying atomic objects that should live in its own tables. Identify duplicated data and determine where that should go. Also identify all relationships that exists on you data structure.
An optional step. The database usually go together with an interface that probably also needs updating. Look at that design also in this step and decide if there are any isolated parts that can wait, both in the data structure and interface program. How much should be included is determined by practical aspects such as time and budget. Maybe some part does not need modification just yet.
It can also be an option to start completely from scratch, skip backwards compatibility and let there be two parallel systems
Write a script that adds all new columns and tables that the normalized data requires.
Write another script that transfers the non-normalized data to the new normalized data structure. This is the most tricky part I would say, and can be rather messy, depending on in how bad shape the old data is.
Enforce all constraints from the model on the new normalized data by adding constraints to the new tables and columns. This is also best done in a script. Her you will see if your data migration succeeded. If it did, you will be able to add all the constraints. If it failed, some constraint will fail, and you will have to go back and look at what failed.
Finally, make yet another script that deletes all columns and tables that were removed in the new model. By doing this, you will easily identify all places in the interface which needs updating. Anything talking to anything in these columns and tables will have to be updated in the interface.
Some general tips is to do all development against a, maybe reduced, copy of the database. E.g. in MySQL you can do an SQL Dump using for example Workbench and test you scripts on that. You will probably need a few iterations on the database before the migration works. In connection, also do the actual migration on a copy of the database, not to break anything in production.
Upvotes: 0
Reputation: 2556
Here is an Example of Normalizing Tables in a Script. I advise you do something like this
e.g Table: tbl_tmpData
Date, ProductName, ProductCode, ProductType, MarketDescription, Units, Value
2010-01-01, 'Arnotts Biscuits', '01', 'Biscuit', 'Store 1', 20, 20.00
2010-01-02, 'Arnotts Biscuits', '01', 'Biscuit', 'Store 2', 40, 40.00
2010-01-03, 'Arnotts Biscuits', '01', 'Biscuit', 'Store 3', 40, 40.00
2010-01-01, 'Cola', '02', 'Drink', 'Store 1', 40, 80.00
2010-01-02, 'Cola', '02', 'Drink', 'Store 2', 20, 40.00
2010-01-03, 'Cola', '02', 'Drink', 'Store 2', 60, 120.00
2010-01-01, 'Simiri Gum', '03', 'Gum', 'Store 1', 40, 80.00
2010-01-02, 'Simiri Gum', '03', 'Gum', 'Store 2', 20, 40.00
2010-01-03, 'Simiri Gum', '03', 'Gum', 'Store 3', 60, 120.00
You would Create Your Date Table first:
CREATE TABLE tbl_Date
(
DateID int PRIMARY KEY IDENTITY(1,1)
,DateValue datetime
)
INSERT INTO tbl_Date (DateValue)
SELECT DISTINCT Date
FROM tbl_Data
WHERE Date NOT IN (SELECT DISTINCT DateValue FROM tbl_Date)
you would then Create your Market Table
CREATE TABLE tbl_Market
(
MarketID int PRIMARY KEY IDENTITY(1,1)
,MarketName varchar(200)
)
INSERT INTO tbl_Market (MarketName)
SELECT DISTINCT MarketDescription
FROM tbl_tmpData
WHERE MarketName NOT IN (SELECT DISTINCT MarketDescription FROM tbl_Market)
you would then Create your ProductType Table
CREATE TABLE tbl_ProductType
(
ProductTypeID int PRIMARY KEY IDENTITY(1,1)
,ProductType varchar(200)
)
INSERT INTO tbl_ProductType (ProductType)
SELECT DISTINCT ProductType
FROM tbl_tmpData
WHERE ProductType NOT IN (SELECT DISTINCT ProductType FROM tbl_ProductType)
you would then Create your Product Table
CREATE TABLE tbl_Product
(
ProductID int PRIMARY KEY IDENTITY(1,1)
, ProductCode varchar(100)
, ProductDescription varchar(300)
,ProductType int
)
INSERT INTO tbl_Product (ProductCode, ProductDescription, ProductType)
SELECT DISTINCT tmp.ProductCode,tmp.ProductName, pt.ProductType
FROM tbl_tmpData tmp
INNER JOIN tbl_ProductType pt ON tmp.ProductType = pt.ProductType
WHERE ProductCode NOT IN (SELECT DISTINCT ProductCode FROM tbl_Product)
you would then Create your Data Table
CREATE TABLE tbl_Data
(
DataID int PRIMARY KEY IDENTITY(1,1)
, DateID varchar(100)
, ProductID varchar(100)
, MarketID varchar(300)
,Units decimal(10,5)
, value decimal(10,5)
)
INSERT INTO tbl_Data (ProductID, MarketID, Units, Value)
SELECT t.DateID
, p.ProductID
, m.MarketID
, SUM(tmp.Units)
, SUM(tmp.VALUE)
FROM tbl_tmpData tmp
INNER JOIN tbl_Date t ON tmp.Date = t.DateValue
INNER JOIN tbl_Product p ON tmp.ProductCode = p.ProductCode
INNER JOIN tbl_Market m ON tmp.MarketDescription = m.MarketName
GROUP BY t.DateID, p.ProductID, m.MarketID
ORDER BY t.DateID, p.ProductID, m.MarketID
Upvotes: 10