Reputation: 2829
So we have an existing system that we are trying to scale up, and it is running out of memory retrieving close to 3M records.
I was trying to determine how viable increasing server memory would be as a stopgap solution by ascertaining the size of the data returned by the query, doing something like:
select sum(row_size)
from (
SELECT
ifnull(LENGTH(qr.id), 0)+
ifnull(LENGTH(qr.question_id), 0)+
ifnull(LENGTH(qr.form_response_id), 0)+
ifnull(LENGTH(qr.`value`), 0)+
ifnull(LENGTH(qr.deleted_at), 0)+
ifnull(LENGTH(qr.created_at), 0)+
ifnull(LENGTH(qr.updated_at), 0)
as row_size
FROM
....
LIMIT 500000
) as tbl1;
This returns 30512865, which is roughly 30 MB of data.
However when I cross check what PHP actually uses to store the results using:
$memBefore = memory_get_usage();
$formResponses = DB::select($responsesSQL, $questionIDsForSQL);
$memAfter = memory_get_usage();
dd($memBefore, $memAfter);
I get 315377552 (after) and 22403248 (before), i.e. 292974304
bytes, or roughly 300 MB of memory, used to store a simple array!
I would like to understand why the memory footprint is 10 times the data retrieved, and whether there is anything I can do to reduce that footprint, short of modifying the API response on the back end and the front end so they no longer need the entire result set, which will take time.
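To see how much of this is PHP's own per-value overhead rather than my data, I reproduced the inflation with a standalone script (the row shape and counts below are illustrative, not taken from the real system). On 64-bit PHP 7+, every string carries a header of roughly 32 bytes, every `stdClass` row (what `PDO::FETCH_OBJ` produces) adds an object header plus a property table, and every array element costs a hash-table bucket, so a row whose raw column data is a few dozen bytes can easily occupy ten times that:

```php
// Standalone repro: the raw column data here is ~15 bytes per row, yet
// the stdClass-per-row representation costs far more, because each
// string has its own header and each object carries a property table.
$before = memory_get_usage();
$rows = [];
for ($i = 0; $i < 100000; $i++) {
    $row = new stdClass();
    $row->id = (string) $i;       // a few bytes of actual "data"
    $row->value = 'answer text';  // 11 bytes of actual "data"
    $rows[] = $row;
}
$after = memory_get_usage();
// Prints tens of MB even though the payload itself is ~1.5 MB.
printf("%.1f MB for 100k tiny rows\n", ($after - $before) / 1048576);
```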
For context, the current implementation takes the results above (returned by getQuestionResponses) and transforms them into an associative array grouped by question_id using Laravel Collections:
collect($this->questionResponseRepo->getQuestionResponses($questions))->groupBy('question_id')->toArray();
I am thinking of replacing the collect
call with my own, more memory-efficient implementation that works on the array returned from the query directly, to avoid the inflation from converting it into a Laravel Collection. But that still does not help with the array itself taking 300 MB for 500k response records instead of 30 MB.
One of the solutions suggested online is to use SplFixedArray
, but I am not sure how to force DB::select
to use that instead of a plain array.
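From what I can tell, SplFixedArray would only shave the outer array's bucket overhead anyway, and SplFixedArray::fromArray() needs the ordinary array to exist first, so the peak memory would barely move. A quick sketch of the conversion (toy data, not the real result set):

```php
// SplFixedArray replaces the outer hash table with a plain C array of
// slots, but fromArray() copies from an already-built PHP array, so
// the peak allocation still includes the original array.
$rows = [];
for ($i = 0; $i < 1000; $i++) {
    $rows[] = ['id' => $i, 'question_id' => $i % 10];
}
$fixed = SplFixedArray::fromArray($rows); // integer keys 0..999 preserved
unset($rows); // only now is the ordinary array's memory released
```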
Another possible solution involves ensuring it returns simple assoc array instead of array of standard classes https://stackoverflow.com/a/37532052/373091 But when I try that as in:
// get original model
$fetchMode = DB::getFetchMode();
// set mode to custom
DB::setFetchMode(\PDO::FETCH_ASSOC);
$memBefore = memory_get_usage();
$formResponses = DB::select($responsesSQL, $questionIDsForSQL);
DB::setFetchMode($fetchMode);
$memAfter = memory_get_usage();
dd($memBefore, $memAfter, $formResponses);
, I get the error Call to undefined method Illuminate\Database\MySqlConnection::getFetchMode()
which apparently means it can no longer be done in Laravel > 5.4 :(
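The closest replacement I found for newer Laravel versions is to listen for the StatementPrepared event and set the fetch mode on the prepared PDOStatement there, e.g. in a service provider's boot method. A sketch (untested on my exact version):

```php
use Illuminate\Database\Events\StatementPrepared;
use Illuminate\Support\Facades\Event;

// Laravel >= 5.4 removed setFetchMode(); the suggested workaround is to
// set the mode on each prepared PDOStatement via this event, typically
// registered in AppServiceProvider::boot().
Event::listen(StatementPrepared::class, function (StatementPrepared $event) {
    $event->statement->setFetchMode(\PDO::FETCH_ASSOC);
});
```

That would make DB::select return associative arrays instead of stdClass objects, though it applies globally to every query unless the listener filters by connection.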
Any suggestions?
Upvotes: 3
Views: 367
Reputation: 280
I think the real problem is that you're loading all 3 million records into memory at once. You should instead either process them in chunks or use a cursor.
To chunk records into batches, you can use Laravel's chunk method. It accepts two parameters: the chunk size, and a callback that is passed each subset of models or objects for processing. This executes one query per chunk.
Here's the example taken from the documentation:
Flight::chunk(200, function ($flights) {
foreach ($flights as $flight) {
//
}
});
Alternatively, you can use the cursor method if you want to execute only a single query. In this case, Laravel hydrates only one model at a time, so you never have more than one model (or object, if you're not using Eloquent) in memory at once.
foreach (Flight::where('destination', 'Zurich')->cursor() as $flight) {
//
}
Upvotes: 1