Kate
Kate

Reputation: 732

MySQL LOAD DATA INFILE Taking 13 Hours

Is there anything I can change in the my.ini file to speed up "LOAD DATA INFILE"?

I have two MySQL 5.5 instances each of which has one identical table structured as follows:

CREATE TABLE `log_access` (
  `_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `type_id` int(11) NOT NULL,
  `building_id` int(11) NOT NULL,
  `card_id` varchar(15) NOT NULL,
  `user_key` varchar(35) DEFAULT NULL,
  `user_name` varchar(25) DEFAULT NULL,
  `user_validation` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`_id`),
  KEY `log_access__user_key_timestamp` (`user_key`,`timestamp`)
  KEY `log_access__timestamp` (`timestamp`)
) ENGINE=MyISAM

On a daily basis I need to move the data from previous day from instance A to instance B, which consists of roughly 25 million records. At the moment I am doing the following:

  1. On instance A, generate an OUTFILE with "WHERE timestamp BETWEEN '2014-09-23 00:00:00' AND '2014-09-23 23:59:59'. This usually takes less than 2 minutes.
  2. On instance B, execute "LOAD DATA INFILE". This is the problem area as it takes about 13 hours.
  3. On instance A, delete records from the previous day. This will probably be another
  4. On instance B, run stats On instance B, truncate the table

I have also considered partitioning the tables and just exchanging the partitions. EXCHANGE PARTITION is supported as of 5.6 and I am willing to update MySQL, however, all documentation discusses exchanging between tables and I haven't been able to confirm that I would be able to do that between DB instances.

Replication between the instances, but as I have not tinkered with replication in the past and this is a time sensitive assignment I am somewhat reluctant to tread into new waters.

Any words of wisdom much appreciated.

Upvotes: 1

Views: 1237

Answers (2)

druid62
druid62

Reputation: 129

  1. LOAD DATA INFILE in perfect PK-order, INTO a table that only has the PK-definition, so no secondary indexes yet. After import, add all secondary indexes at once, with 'ALTER TABLE mytable ALGORITHM=INPLACE, LOCK=NONE, ADD KEY ...'. Consider adding back the secondary indexes on each involved box separately, so not via replication (sql_log_bin=0), to prevent replication lag.

  2. Consider using a partitioned table, as then you can run a 'LOAD DATA INFILE' per partition, in parallel. (applies to RANGE and HASH partitioning, as the separate tsv-files (one or more per partition) are easy to prepare for those)

MariaDB doesn't have the variant 'INTO mytable PARTITION (p000)' yet. You can load into a separate table first, and then exchange partitions, but MariaDB also doesn't have 'WITHOUT VALIDATION' yet.

Upvotes: 0

Benvorth
Benvorth

Reputation: 7722

CREATE the table without PRIMARY KEY and _id column and add these after LOAD DATA INFILE is complete. MySQL checks the PRIMARY KEY integrity with each INSERT, so I think you can gain a lot of performance here. With MariaDB you can disable keys, but I think this won't work on some storage engines (see here)

Not-very-nice-alternative: I found it very easy to move a MYISAM-database by just copy/move the files on disk. If you cut/paste the files and run a REPAIR TABLE. on your target machine you can do this without restarting the Server. Just make sure you copy all 3 files (.frm, .myd, .myi)

Upvotes: 1

Related Questions