Reputation: 771
I have to upload about 16 million records to a MySQL 5.1 server on a shared webspace which does not permit LOAD DATA functionality. The table is an Innodb table. I have not assigned any keys yet.
Therefore, I use a Python script to convert my CSV file (of 2.5 GB of size) to an SQL file with individual INSERT statements. I've launched the SQL file, and the process is incredibly slow, it feels like 1000-1500 lines are processed every minute!
In the meantime, I read about bulk inserts, but did not find any reliable source telling how many records one insert statement can have. Do you know?
Is it an advantage to have no keys and add them later?
Would a transaction around all the insert help speed up the process? In fact, there's just a single connection (mine) working with the database at this time.
Upvotes: 0
Views: 90
Reputation: 18368
If you use insert ... values ...
syntax to insert multiple rows running a single request your query size is limited by max_allowed_packet value rather than by number of rows.
Concerning keys: it's a good practice to define keys before any data manipulations. Actually, when you build a model you must think of keys, relations, indexes etc.
It's better do define indexes before you insert data as well. CREATE INDEX
works quite slowly on huge datasets. But postponing indexes creation is not a huge disadvantage.
To make your inserts faster try to turn autocommit mode on and do not run concurrent requests on your tables.
Upvotes: 1