Reputation: 4637
I want to speed up an import of a very large xml file into my mysql db, I decided to dump all the queries into a SQL file and execute it on console so its a bit faster.
But my problem is that I cannot simply insert all the data.
Some tables are linked, thats the reason why i cannot use the very fast LOAD DATA INFILE
feature. So I want to put the queries in a SQL file, but i need some control operations.
I have a data set I need its ID (as primary key) to be able to add data in another table using that ID.
So what I tried via mysql console is this:
INSERT IGNORE INTO tableA VALUES ( A, B, C);
SET @id = LAST_INSERT_ID();
IF( @id, SELECT 1, SELECT id INTO @id FROM tableA WHERE a=A and b=B and c=C);
INSERT INTO tableB VALUES ( @id, B, C);
Obviously the IF statement doesnt work, it works only with "SELECT IF".
What im trying to do is to add a dataset into tableA using INSERT IGNORE, so duplicate errors are ignored. If it adds a new row, i get my @id with LAST_INSERT_ID(), if there is a duplicate @id is empty, but with my IF check i select THAT duplicate and put it INTO @a, so in any case i have the @id set. then i use @id to put my data into tableB so i have the correct linking.
Is there any possibility to do this workflow with IF? Since I cannot create a simple CSV to use LOAD DATA INFILE because of my linked tables for which I need to do some checks, I think generating SQL is the best.
My XML file is like 20-25GB large. My perl script works like 3 weeks to import into the database doing all the checks, but is very slow because of all the mysql queries I do, I want to put all the queries in one file and put it into the DB with one shot. If i could control my query flow I could create that big sql file instead of running millions of queries with all the checks in my perl script.
Please tell me that this is possible.
Upvotes: 1
Views: 481
Reputation: 3947
I'm not sure you need to do that. If tableA.a
, tableA.b
, and tableA.c
all determine tableA.id
, then regardless of whether the row entry is successful or not, you should be able to do:
INSERT IGNORE INTO tableA VALUES (A, B, C);
INSERT INTO tableB SELECT id, B, C FROM tableA WHERE a=A AND b=B AND c=C;
I realize this isn't exactly the same as the queries you have posted. The big difference is that if a row is actually inserted properly into tableA
(ie. no duplicate row error), then the statements above won't insert a value of 1
for the first field in tableB
. If that's what you really want, then the following should work:
INSERT IGNORE INTO tableA VALUES (A, B, C);
SET @id = LAST_INSERT_ID();
INSERT INTO tableB SELECT IF(@id IS NOT NULL, 1, id)
FROM tableA
WHERE a=A AND b=B AND c=C;
And I'm assuming there that LAST_INSERT_ID()
will return NULL
and not the actual auto-increment value of the last successful insert. I have not verified that actual behavior.
Upvotes: 1