Reputation: 4168
I have a feature where a user can upload a CSV file (a list) to store/update emails in the DB:
// load CSV
$records = file($request->file('list'), FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

// prepare for mass insert
$updateOrCreate = [];
foreach ($records as $email) {
    $updateOrCreate[] = [
        'email' => $email,
    ];
}
DB::beginTransaction();
try {
    $toKeep = [];

    // run a loop to check if the email already exists; if not, create it
    foreach ($updateOrCreate as $record) {
        $email = List::updateOrCreate([
            'email' => $record['email'],
        ], $record);

        $toKeep[] = $email->id;
    }

    // delete all records that were not in the new CSV
    if (count($toKeep)) {
        List::whereNotIn('id', $toKeep)->delete();
    }
} catch (Exception $e) {
    DB::rollBack();

    return response()->json([
        'message' => $e->getMessage(),
    ], 422);
}
DB::commit();
This solution works fine up to a few thousand records, but if I try to upload 50k or more it gets slow. I understand why it's slow: it runs a SELECT plus an UPDATE or INSERT for every single record:
foreach ($updateOrCreate as $record) {
    $email = List::updateOrCreate([
        'email' => $record['email'],
    ], $record);

    $toKeep[] = $email->id;
}
For the insert I've used a raw pg_copy_from, which works great and fast:
$result = pg_copy_from($dbconn, 'lists (email)', $records, ',');
So I was wondering if there is something similar for the rest: delete records that are not in the new list and add the ones that are new, in a more efficient/faster way, maybe something like pg_copy_from.
Upvotes: 0
Views: 373
Reputation: 3030
From what I can see, you're doing this as an AJAX upload of the CSV, and you're probably hitting timeouts during processing. The simple fact is that with this volume of records and the actions you are taking, this is going to take a long time.
I recommend rethinking your approach to something like:

- store the uploaded CSV and return a response to the user right away,
- process the file in a queued background job, and
- notify the user (email, notification, or a status page) once the import has finished.

As long as you manage the user's expectations, this should work.
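Here's a minimal sketch of such a queued job, folding in the set-based SQL you were looking for. Assumptions: the controller has already stored the upload and dispatched the job with the file path, the lists table has a unique index on email (required for ON CONFLICT), and the ImportEmailList class name and the lists_staging table are made-up names for illustration. pgsqlCopyFromArray is PDO's counterpart of pg_copy_from, so the bulk load runs on the same connection that owns the temp table.

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Facades\DB;

class ImportEmailList implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public function __construct(private string $path)
    {
    }

    public function handle(): void
    {
        $emails = file($this->path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

        DB::transaction(function () use ($emails) {
            // staging table lives only for the duration of this transaction
            DB::statement('CREATE TEMP TABLE lists_staging (email text) ON COMMIT DROP');

            // bulk-load the CSV rows, COPY-style, on Laravel's own connection
            DB::connection()->getPdo()->pgsqlCopyFromArray('lists_staging', $emails);

            // insert emails that are new (assumes a unique index on lists.email)
            DB::statement('
                INSERT INTO lists (email)
                SELECT email FROM lists_staging
                ON CONFLICT (email) DO NOTHING
            ');

            // delete emails that are not in the uploaded file
            DB::statement('
                DELETE FROM lists
                WHERE NOT EXISTS (
                    SELECT 1 FROM lists_staging s WHERE s.email = lists.email
                )
            ');
        });
    }
}

The controller then just stores the file somewhere the queue worker can read it, dispatches ImportEmailList::dispatch($path), and returns immediately. Both statements operate on whole sets, so there is no per-row round trip no matter how many emails the CSV contains.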
Upvotes: 3