Reputation: 101
Greetings Support Community,
I have 10 million+ files that I am trying to load into a MySQL database using the following script:
WORKING_DIR=/tmp
FILE1="*test*"
timestamp_format="%Y-%m-%d %H:%i:%s.%x"
for i in ${WORKING_DIR}/${FILE1}
do
    if [ -f "$i" ]; then
        mysql -uroot -ptest my_database --local-infile=1 <<-SQL
SET sql_log_bin=0;
LOAD DATA LOCAL INFILE '${i}' INTO TABLE my_table
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(id, transaction_id, app_id, sub_id);
SQL
    fi
done
It's an extremely slow process. After about 24 hours, I've only been able to load about 2 million records; each file contains a single record. At this rate, it will take about 5 days to complete. Is there a faster way to do this? For example, should I concatenate the files before processing?
Any suggestions for improving how this data is loaded into MySQL would be greatly appreciated.
Thanks!
Upvotes: 0
Views: 150
Reputation: 189347
You ask (in a comment) how to concatenate your files. That would be
cat /tmp/*test1*
though apparently you actually want to omit the first line from each:
awk 'FNR>1' /tmp/*test1*
How to make your LOAD DATA statement read from standard input is beyond my competence. If you can't, maybe save the combined output to a temporary file and process that.
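Untested, but that temporary-file route might look something like the sketch below. The mysql invocation, table, and column list are copied from your script; /tmp/combined.csv is just a placeholder name I made up, and IGNORE 1 LINES is dropped because the awk step already strips each file's header line.
# collect every record once, skipping each file's header line
awk 'FNR>1' /tmp/*test1* > /tmp/combined.csv
# load the combined file with a single LOAD DATA statement
mysql -uroot -ptest my_database --local-infile=1 <<-SQL
SET sql_log_bin=0;
LOAD DATA LOCAL INFILE '/tmp/combined.csv' INTO TABLE my_table
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
(id, transaction_id, app_id, sub_id);
SQL
That runs one LOAD DATA instead of one mysql process per file, which is presumably where most of your 24 hours is going.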
If you get "argument list too long", maybe try
find /tmp -maxdepth 1 -type f -name '*test1*' -exec awk 'FNR>1' {} +
The -maxdepth 1 says not to descend into subdirectories; take it out if that's not what you want. The -exec with a plus might not be available on really old systems; try \; in its place if you get a syntax error (though there can be a rather unpleasant performance penalty).
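If you do end up on the find route, it can feed the same temporary file as above (again, /tmp/combined.csv is only a placeholder name):
# same header-stripping awk, but find builds the file list instead of the shell glob
find /tmp -maxdepth 1 -type f -name '*test1*' -exec awk 'FNR>1' {} + > /tmp/combined.csv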
I don't see that the variables made anything clearer, easier, more readable, or more maintainable, so I simply took them out.
Upvotes: 1