Reputation: 1084
I have two database tables, one in MySQL and one in MSSQL. Both hold similar data, and one is based on data from the other. They're in two different databases because one is a remote system administered elsewhere, and the local system is a Drupal installation which I'm using to show the data in a friendlier manner through a custom module.
For example, I have a table of this sort of structure in MSSQL:
ID | Title | Description | Other fields I don't care about
And based on data pulled from this table I generate a table in MySQL:
local_id | remote_id | title | description
When the module is initialized, it selects from the MSSQL table, generates records, and populates the local database. remote_id stores the ID field from the MSSQL database so we can cross-reference the two records.
I need to sync up this data: deleting records locally which no longer exist in the remote table, creating new records which do not exist locally, and updating each row's information.
The problem is that this requires at least two separate passes, possibly with per-row queries as well. Example:
To sync local to Remote and remove non-existent remote records:
Select remote_id from local_table;
For Each remote_id ( select ID, title, description FROM remote_table where ID = remote_id )
If record exists
UPDATE local_table WHERE remote_id = row_id
Else
DELETE FROM local_table where remote_id = row_id
Then we need at least one other transaction to get new records (I could update here too if I didn't do it in the previous loop):
Select ID, title, description from remote_table;
For each ID ( Select remote_id from local_table )
If does not exist
INSERT INTO local_table (VALUES)
So that's a lot of db activity. It would be easier if the tables were of the same type, but as it is, that's the only way I know how to do it. Is there a better way? Could I just pull both result sets into an associative array, compare them that way, and only run the queries necessary to remove and create records? I'm unsure.
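The associative-array idea can be sketched as a pure diff step: load both result sets into dicts keyed by the remote ID, then classify each ID as insert, update, or delete. A minimal Python sketch (the helper name and row shapes are assumptions, and it assumes both result sets fit in memory):

```python
def diff_rows(remote, local):
    """Compare two dicts keyed by remote ID and classify the work.

    remote: {id: (title, description)} pulled from the MSSQL table
    local:  {remote_id: (title, description)} pulled from the MySQL table
    Returns (to_insert, to_update, to_delete) as sorted lists of IDs.
    """
    remote_ids = set(remote)
    local_ids = set(local)
    to_insert = sorted(remote_ids - local_ids)           # new on remote
    to_delete = sorted(local_ids - remote_ids)           # gone from remote
    to_update = sorted(rid for rid in remote_ids & local_ids
                       if remote[rid] != local[rid])     # changed rows only
    return to_insert, to_update, to_delete

remote = {1: ("A", "alpha"), 2: ("B", "beta"), 4: ("D", "delta")}
local = {1: ("A", "alpha"), 2: ("B", "old"), 3: ("C", "gamma")}
print(diff_rows(remote, local))  # ([4], [2], [3])
```

With this, only the rows in the three lists generate queries; unchanged rows cost nothing beyond the two initial selects.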
Upvotes: 1
Views: 2311
Reputation: 648
There are a lot of ways to do this depending on the system you have. The first assumption I'm making is that you have 2 databases and want to sync data between them, i.e. the MSSQL db must pull data from MySQL and vice versa.
Your approach of using associative arrays is good, but what if there are 100 columns in the table? (In your case there aren't, but the approach is not future-proof.) To update 1 row you need to make "n" column comparisons; if there are 100 rows, that's 100*n comparisons.
Have a look at MySQL's REPLACE and INSERT ... ON DUPLICATE KEY UPDATE clauses; those might help you. I don't know if there are such clauses in MSSQL.
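To illustrate the upsert idea, here is a sketch using Python's built-in sqlite3 as a stand-in for a live MySQL connection (the table and values are made up; the equivalent MySQL syntax is shown in the comment, while SQLite spells it `ON CONFLICT ... DO UPDATE`):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE local_table (remote_id INTEGER PRIMARY KEY, title TEXT)")
con.execute("INSERT INTO local_table VALUES (1, 'old title')")

# Upsert: insert the row, or update it if remote_id already exists.
# In MySQL this would be:
#   INSERT INTO local_table (remote_id, title) VALUES (%s, %s)
#   ON DUPLICATE KEY UPDATE title = VALUES(title);
for rid, title in [(1, "new title"), (2, "second row")]:
    con.execute(
        "INSERT INTO local_table (remote_id, title) VALUES (?, ?) "
        "ON CONFLICT(remote_id) DO UPDATE SET title = excluded.title",
        (rid, title),
    )

print(con.execute("SELECT * FROM local_table ORDER BY remote_id").fetchall())
# [(1, 'new title'), (2, 'second row')]
```

Either way, one statement per row covers both the insert and update cases, so you only need a separate pass for deletes.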
You can do other things too, like adding a "last_updated" column to each database table; whenever a column in the table gets updated, this timestamp field must be updated as well.
That way you can tell whether a row in either database table was updated (by comparing it to your old timestamp value) and only update those rows.
The logic would be along these lines:
to sync local to remote
foreach localrow
get the common_id of the row
get the timestamp of the row
check if a row with this common_id exists in the remote table
if no then insert
if yes then
compare timestamps between local and remote row
if local row timestamp > remote row timestamp then update remote row
Upvotes: 1
Reputation: 52665
Rather than doing row-by-row operations, you could do set-based operations. e.g.
INSERT INTO local_table (values)
SELECT .. FROM remote_table
WHERE NOT EXISTS (Select ... FROM local_table WHERE remote_table.field = local_table.field and ...)
In order to do that you'll need to add a linked server; see sp_addlinkedserver. You can create a link from SQL Server to any server listed on that page, which includes any database that has an ODBC driver, and MySQL does.
I am not aware if MySQL is capable of doing the reverse.
Upvotes: 0