NovumCoder

Reputation: 4637

Control INSERT and UPDATE queries with IF in SQL file

I want to speed up the import of a very large XML file into my MySQL database, so I decided to dump all the queries into a SQL file and execute it on the console, which should be a bit faster.

But my problem is that I cannot simply insert all the data.

Some tables are linked, which is why I cannot use the very fast LOAD DATA INFILE feature. So I want to put the queries in a SQL file, but I need some control operations.

For each data set I insert, I need its ID (the primary key) so that I can add data to another table using that ID.

So what I tried via the MySQL console is this:

INSERT IGNORE INTO tableA VALUES ( A, B, C);
SET @id = LAST_INSERT_ID();
IF( @id, SELECT 1, SELECT id INTO @id FROM tableA WHERE a=A and b=B and c=C);
INSERT INTO tableB VALUES ( @id, B, C);

Obviously the IF statement doesn't work; IF only works as a function, i.e. "SELECT IF(...)".

What I'm trying to do is add a data set into tableA using INSERT IGNORE, so duplicate errors are ignored. If it adds a new row, I get my @id with LAST_INSERT_ID(); if there is a duplicate, @id is empty, but with my IF check I select THAT duplicate row and put its id INTO @id, so @id is set in either case. Then I use @id to put my data into tableB so I have the correct linking.
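For example, I thought about expressing it with the IF() function instead (the "SELECT IF" style) together with ROW_COUNT(), roughly like the sketch below, but I have no idea whether ROW_COUNT() and LAST_INSERT_ID() really behave this way after an ignored duplicate ('A', 'B', 'C' are just placeholder values):

INSERT IGNORE INTO tableA (a, b, c) VALUES ('A', 'B', 'C');
-- ROW_COUNT() should be 1 if a new row went in, 0 if the duplicate was ignored
SET @inserted = ROW_COUNT();
-- IF() as a function: keep LAST_INSERT_ID() for a fresh row,
-- otherwise look up the id of the existing duplicate
SET @id = IF(@inserted > 0,
             LAST_INSERT_ID(),
             (SELECT id FROM tableA WHERE a='A' AND b='B' AND c='C'));
INSERT INTO tableB VALUES (@id, 'B', 'C');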

Is there any possibility to do this workflow with IF? Since I cannot create a simple CSV for LOAD DATA INFILE because of my linked tables, for which I need to do some checks, I think generating SQL is the best option.

My XML file is around 20-25 GB. My Perl script takes about 3 weeks to import it into the database while doing all the checks, and it is slow mainly because of all the MySQL queries it runs. I want to put all the queries into one file and load it into the DB in one shot. If I could control the query flow, I could generate that big SQL file instead of running millions of queries with all the checks in my Perl script.

Please tell me that this is possible.

Upvotes: 1

Views: 481

Answers (1)

hrunting

Reputation: 3947

I'm not sure you need to do that. If tableA.a, tableA.b, and tableA.c together determine tableA.id, then regardless of whether the insert succeeds or not, you should be able to do:

INSERT IGNORE INTO tableA VALUES (A, B, C);
INSERT INTO tableB SELECT id, B, C FROM tableA WHERE a=A AND b=B AND c=C;

I realize this isn't exactly the same as the queries you have posted. The big difference is that if a row is actually inserted properly into tableA (i.e. no duplicate-row error), then the statements above won't insert a value of 1 for the first field in tableB. If that's what you really want, then the following should work:

INSERT IGNORE INTO tableA VALUES (A, B, C);
SET @id = LAST_INSERT_ID();
INSERT INTO tableB SELECT IF(@id IS NOT NULL, 1, id), B, C
                     FROM tableA
                    WHERE a=A AND b=B AND c=C;

And I'm assuming there that LAST_INSERT_ID() will return NULL and not the actual auto-increment value of the last successful insert. I have not verified the actual behavior.
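If you want to check that before generating the whole file, a quick test along these lines, against an (A, B, C) combination that already exists in tableA, should show what an ignored duplicate actually leaves behind:

INSERT IGNORE INTO tableA VALUES (A, B, C);   -- a combination that already exists
SELECT LAST_INSERT_ID(), ROW_COUNT();         -- see what the ignored insert reports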

Upvotes: 1
