Radical_Activity

Reputation: 2738

Fastest way to insert/update a million rows in Laravel 5.7

I'm using Laravel 5.7 to fetch large amounts of data (around 500k rows) from an API server and insert it into a table (call it Table A) quite frequently (at least every six hours, 24/7) - however, it's enough to insert only the changes the next time we insert (but at least 60-70% of the items will change). So this table will quickly have tens of millions of rows.

I came up with the idea to make a helper table (call it Table B) to store all the new data into it. Before inserting everything into Table A, I want to compare it to the previous data (with Laravel, PHP) from Table B - so I will only insert the records that need to be updated. Again it will usually be around 60-70% of the records.

My first question is whether this above-mentioned approach is the preferred way of doing it in this situation (obviously I want to make it happen as fast as possible). I assume that searching for and updating the records in the table would take a lot more time and would keep the table busy / lock it. Is there a better way to achieve the same thing (meaning to update the records in the DB)?


The second issue I'm facing is the slow insert times. Right now I'm using a local environment (16GB RAM, i7-6920HQ CPU) and MySQL is inserting the rows very slowly (about 30-40 records at a time). The size of one row is around 50 bytes.

I know it can be made a lot faster by fiddling around with InnoDB's settings. However, I'd also like to think that I can do something on Laravel's side to improve performance.

Right now my Laravel code looks like this (only inserting 1 record at a time):

foreach ($response as $key => $value)
{
    DB::table('table_a')
        ->insert(
        [
            'test1' => $value['test1'],
            'test2' => $value['test2'],
            'test3' => $value['test3'],
            'test4' => $value['test4'],
            'test5' => $value['test5'],
        ]);
}

$response is an array.

So my second question: is there any way to increase the insert speed to something like 50k records/second, both at the Laravel application layer (by doing batch inserts) and at the MySQL InnoDB level (by changing the config)?

Current InnoDB settings:

innodb_buffer_pool_size        = 256M
innodb_log_file_size           = 256M
innodb_thread_concurrency      = 16
innodb_flush_log_at_trx_commit = 2
innodb_flush_method            = normal
innodb_use_native_aio          = true

MySQL version is 5.7.21.

If I forgot to tell/add anything, please let me know in a comment and I will do it quickly.

Edit 1: The server that I'm planning to use will have SSD on it - if that makes any difference. I assume MySQL inserts will still count as I/O.

Upvotes: 8

Views: 23255

Answers (5)

Bauroziq

Reputation: 1231

I usually handle this problem with pagination and queued jobs that are dispatched recursively: each page of the loop is processed in its own job, and that job dispatches the next page when it finishes.

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
// plus an import for your Purchase model, depending on its namespace

class ProcessPurchaseJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    protected $page;

    public function __construct($page = null)
    {
        $this->page = $page ?? 1;
    }

    public function handle()
    {
        $batchSize = 1000; // Amount of data per page

        $purchases = Purchase::orderBy('id')
            ->paginate($batchSize, ['*'], 'page', $this->page);

        foreach ($purchases as $item) {
            // Execute your purchase logic code
        }

        // If there are still more pages, dispatch the job again for the next page
        if ($purchases->hasMorePages()) {
            dispatch(new ProcessPurchaseJob($purchases->currentPage() + 1));
        }
    }
}

Then dispatch the job from anywhere:

ProcessPurchaseJob::dispatch();

Upvotes: 0

Elias Soares

Reputation: 10254

Disable autocommit and manually commit at end of insertion

According to the MySQL 8.0 docs (8.5.5 Bulk Data Loading for InnoDB Tables), you can increase the INSERT speed by turning off autocommit:

  • When importing data into InnoDB, turn off autocommit mode, because it performs a log flush to disk for every insert. To disable autocommit during your import operation, surround it with SET autocommit and COMMIT statements:
    SET autocommit=0;
    ... SQL import statements ...
    COMMIT;

Another way to do this in Laravel is to use database transactions:

DB::beginTransaction();

// Your inserts here

DB::commit();
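
Combined with the chunked multi-row inserts discussed below, a minimal sketch (reusing the table and column names from the question, with an arbitrary chunk size of 1,000 rows) could look like this:

// Sketch: chunked bulk inserts that all commit together at the end.
DB::transaction(function () use ($response) {
    foreach (array_chunk($response, 1000) as $chunk) {
        $rows = [];

        foreach ($chunk as $value) {
            $rows[] = [
                'test1' => $value['test1'],
                'test2' => $value['test2'],
                'test3' => $value['test3'],
                'test4' => $value['test4'],
                'test5' => $value['test5'],
            ];
        }

        DB::table('table_a')->insert($rows);
    }
});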

Use INSERT with multiple VALUES

Also according to the MySQL 8.0 docs (8.2.5.1 Optimizing INSERT Statements), you can optimize INSERT speed by using multiple VALUES lists in a single INSERT statement.

To do this with Laravel, you can just pass an array of rows to the insert() method:

DB::table('your_table')->insert([
   [
       'column_a'=>'value',
       'column_b'=>'value',
   ],
   [
       'column_a'=>'value',
       'column_b'=>'value',
   ],
   [
       'column_a'=>'value',
       'column_b'=>'value',
   ],
]);

According to the docs, it can be many times faster.

Read the docs

Both MySQL docs links in this post have plenty of tips on increasing INSERT speed.

Avoid using Laravel/PHP for the inserts

If your data source is (or can be) a CSV file, the import will run a lot faster if you use mysqlimport to load the data.

Using PHP and Laravel to import data from a CSV file adds overhead, unless you need to do some data processing before inserting.
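
If you would rather stay inside the application, a rough equivalent is a raw LOAD DATA LOCAL INFILE statement run through the PDO connection. This is only a sketch: the CSV path is hypothetical, the column list is the one from the question, and it assumes local_infile is enabled on the server and PDO::MYSQL_ATTR_LOCAL_INFILE is set to true in the connection options.

// Sketch only: hypothetical CSV path; requires local_infile on the server
// and PDO::MYSQL_ATTR_LOCAL_INFILE => true in the connection options.
$sql = "LOAD DATA LOCAL INFILE '/path/to/data.csv'
        INTO TABLE table_a
        FIELDS TERMINATED BY ','
        LINES TERMINATED BY '\\n'
        IGNORE 1 LINES
        (test1, test2, test3, test4, test5)";

// Run it on the underlying PDO connection.
DB::connection()->getPdo()->exec($sql);

(mysqlimport itself is essentially a command-line interface to LOAD DATA INFILE, so both approaches end up doing the same work server-side.)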

Upvotes: 10

N69S

Reputation: 17205

You need to do multi-row inserts, but you also need to chunk them so you don't exceed your DB limits.

You can do this by chunking your array

foreach (array_chunk($response, 1000) as $responseChunk)
{
    $insertableArray = [];
    foreach($responseChunk as $value) {
        $insertableArray[] = [
            'test1' => $value['test1'],
            'test2' => $value['test2'],
            'test3' => $value['test3'],
            'test4' => $value['test4'],
            'test5' => $value['test5'],
        ];
    }
    DB::table('table_a')->insert($insertableArray);
}

You can increase the chunk size from 1000 until you approach your DB configuration limit. Make sure to leave some safety margin (e.g. 0.6 times your DB limit).

You can't go any faster than this using Laravel.

Upvotes: 2

uday214125

Reputation: 583

Thanks @Namoshek, I also had the same problem. The solution looks like this:

$users = array_chunk($data, 500, true);

foreach ($users as $key => $user) {
  Model::insert($user);
}

Depending on the data, you can also make use of array_push() to build the rows and then insert.

Upvotes: 6

Tharaka Dilshan

Reputation: 4499

Don't call insert() inside a foreach(), because it will execute n queries against the database when you have n rows of data.

First create an array of rows whose keys match the database column names, and then pass that array to the insert() function.

This will execute only one query against the database, regardless of how many rows you have.

This is way, way faster.

$data_to_insert = [];

foreach ($response as $key => $value)
{
    array_push($data_to_insert, [
            'test1' => $value['test1'],
            'test2' => $value['test2'],
            'test3' => $value['test3'],
            'test4' => $value['test4'],
            'test5' => $value['test5'],
    ]);
}

DB::table('table_a')->insert($data_to_insert);

Upvotes: 6
