Reputation: 4783
I have to update really large set of data, and so far I have this:
protected function updateWidgetsDb($APIWidgets, $date)
{
echo "Eager loading...";
$widgets = Widget::where('date', $date)->get();
echo "Done\n";
echo "Updating...";
foreach ($APIWidgets as $APIWidget) {
$widget = $widgets->where('widget_id', $APIWidget->dimensions[0])->first();
if ($widget == null)
continue;
$widget->update(['revenue' => $APIWidget->metrics[0]->values[0]]);
}
echo "Done\n";
}
$APIWidgets
is an array which is fetched through external API. I am fetching data only for a specific date so I am loading only data from that date which already exists in the DB.
widget_id
field is indexed in DB
EDIT:
I have about 60k of data with about 2k data I am receiving for update. Each record of those 2k has an ID which can already be found in existing 60k data. So after update, the sum of data should still be 60k, not 62k.
Currently, update process takes ~10 minutes.
Upvotes: 1
Views: 60
Reputation: 11340
I can propose two methods to speed up a mass update task. I tried to reproduce your issue, so I created a set of 60k*7 items at my widgets
table with widget_id,date
and widget_id
indexes.
First let's make getting $widget
faster. When I replaced $widgets->where('widget_id', $APIWidget->id)
with
Widget::where('date', $date)
->where('widget_id', $APIWidget->id)
the script became ~400x times faster. Looks like 2000 mysql fetching using index widget_id, date
is faster, than Laravel collection search through 60000-size collection. My results was 16 seconds for queries and ~6400 seconds for collections.
Then let's make update faster. There is a nice approach to update table with only one query using temporary table. So I created a new entity WidgetUpdate
with table widget_updates
and fields id, widget_id, revenue_new
.
I've modified your method: first I collect an array of updates, then
I make a bulk insert into widget_updates
table and after all I
perform a single updating query. My machine does it for 2.2 seconds,
5x times faster. The final speedup is 2000x for my case.
protected function updateWidgetsDb($APIWidgets, $date)
{
echo "Updating...";
$updates = [];
foreach ($APIWidgets as $APIWidget) {
$widget = Widget::where('date', $date)
->where('widget_id', $APIWidget->dimensions[0])
->first();
if ($widget == null)
continue;
$updates[] = ['revenue_new' => $APIWidget->metrics[0]->values[0]];
}
# inserting and updating
WidgetUpdate::insert($updates);
DB::statement('UPDATE widgets, widget_updates ' .
'SET widgets.revenue=widget_updates.revenue_new '.
'WHERE widgets.id = widget_updates.widget_id');
echo "Done\n";
}
Don't forget to cleanup temp table on finish.
Upvotes: 1