Jiew Meng
Jiew Meng

Reputation: 88187

MySQL very large INSERT ... SELECT or Application level batching of SELECT then INSERTS

Which is a better idea if I have very large amounts of data to process (need a SELECT) and then insert into another table (INSERT)

Would a INSERT ... SELECT hang the server? Or perhaps run out of memory? Suppose the SELECT results in a large dataset say 1 million records (but with little data, ~2+ columns. ID columns so its an INT)

UPDATE: Application is using NodeJS

Upvotes: 3

Views: 719

Answers (2)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520898

I consider this answer to be an addendum to Gordon's response. In the first option,

A long INSERT ... SELECT statement, or potentially a stored procedure

The entire operation is taking place in the database, so the only network overhead is in sending the query itself, which is negligible. However, in the second option,

application level SELECT 1st process creating batches of INSERT INTO table () VALUES (), (), ...

You are first retrieving all the data which you want to insert into your application layer, and then sending all of it back to the database. The network overhead here is much larger than the first option. In addition, the application layer (e.g. Java) is not very good at doing database things (but MySQL is). Handling the data in the app layer also might be error prone.

So I would go with the first option if possible.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269483

Keeping all the data processing in the database is generally the best solution, from a performance perspective.

Handling a million rows should be okay. Of course, there are other considerations: indexes and triggers, for instance. Also, if the insert is blocking other queries, than that is another issue. But in general, a million rows from a select should be okay.

Upvotes: 1

Related Questions