Reputation: 1981
My task is: "Take the transactions table, group the rows by transaction date, and count the statuses. These manipulations will produce statistics, which will be rendered on the page."
This is my method for generating these statistics:
public static function getStatistics(Website $website = null)
{
    if ($website === null) {
        return [];
    }

    // Load every transaction for this website, newest first
    $query = \DB::table('transactions')
        ->where('website_id', $website->id)
        ->orderBy('dt', 'desc')
        ->get();

    // Normalize dates and group the rows by day
    $transactions = collect(static::convertDate($query))->groupBy('dt');
    $statistics = collect();

    foreach ($transactions as $date => $trans) {
        $subscriptions   = $trans->where('status', 'subscribe')->count();
        $unsubscriptions = $trans->where('status', 'unsubscribe')->count();
        $prolongations   = $trans->where('status', 'rebilling')->count();
        $redirections    = $trans->where('status', 'redirect_to_lp')->count();
        $conversion      = $redirections == 0 ? 0 : (float) ($subscriptions / $redirections);
        $earnings        = $trans->sum('pay');

        $statistics->push((object) [
            'date'            => $date,
            'subscriptions'   => $subscriptions,
            'unsubscriptions' => $unsubscriptions,
            'prolongations'   => $prolongations,
            'redirections'    => $redirections,
            'conversion'      => round($conversion, 2),
            'earnings'        => $earnings,
        ]);
    }

    return $statistics;
}
If the transaction table has fewer than 100,000 rows, everything works fine. But once the count gets above 150-200k, nginx throws a 502 Bad Gateway. What can you advise? I don't have any experience with handling big data. Maybe my implementation has a fundamental error?
Upvotes: 1
Views: 12042
Reputation: 1111
The best way to handle huge amounts of database data (or a big array) with Laravel is to use a LazyCollection (see the docs) together with chunks. The approach is also discussed here.
<?php

use Illuminate\Support\LazyCollection;

$chunkSize = 500;

// lazy() runs the underlying query in chunks of $chunkSize behind the
// scenes, but yields the hydrated models one at a time, so memory stays flat.
MyModel::lazy($chunkSize)->each(function (MyModel $item) {
    // $item handling
});

// An in-memory array can be streamed in chunks the same way:
// collect($bigDataArray)->lazy()->chunk($chunkSize)->each(function (LazyCollection $items) {...});
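Applied to the query from the question specifically: cursor() on the query builder also returns a LazyCollection, running a single query and yielding the rows as they are consumed. A rough sketch (it assumes $website comes from the question and that dt is stored as a 'Y-m-d H:i:s' string, so the first ten characters are the day):

$perDay = [];

\DB::table('transactions')
    ->where('website_id', $website->id)
    ->orderBy('dt', 'desc')
    ->cursor() // LazyCollection: rows are fetched as they are consumed
    ->each(function ($row) use (&$perDay) {
        $day = substr($row->dt, 0, 10); // group key: the date part of dt
        $perDay[$day][$row->status] = ($perDay[$day][$row->status] ?? 0) + 1;
    });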
Upvotes: 0
Reputation: 1981
After several days of researching this question, I found the right answer:
do NOT use PHP for handling raw data. It's better to use SQL!
In my case, we are using PostgreSQL.
Below is the SQL query that worked for me; maybe it will help someone else.
WITH
cte_range(dt) AS
(
    SELECT
        generate_series('2016-04-01 00:00:00'::timestamp with time zone,
                        '{$date} 00:00:00'::timestamp with time zone,
                        INTERVAL '1 day')
),
cte_data AS
(
    SELECT
        date_trunc('day', dt) AS dt,
        COUNT(*) FILTER (WHERE status = 'subscribe') AS count_subscribes,
        COUNT(*) FILTER (WHERE status = 'unsubscribe') AS count_unsubscribes,
        COUNT(*) FILTER (WHERE status = 'rebilling') AS count_rebillings,
        COUNT(*) FILTER (WHERE status = 'redirect_to_lp') AS count_redirects_to_lp,
        SUM(pay) AS earnings,
        CASE
            WHEN COUNT(*) FILTER (WHERE status = 'redirect_to_lp') > 0
                THEN 100.0 * COUNT(*) FILTER (WHERE status = 'subscribe')::float
                           / COUNT(*) FILTER (WHERE status = 'redirect_to_lp')::float
            ELSE 0
        END AS conversion_percent
    FROM
        transactions
    WHERE
        website_id = {$website->id}
    GROUP BY
        date_trunc('day', dt)
)
SELECT
    to_char(cte_range.dt, 'YYYY-MM-DD') AS day,
    COALESCE(cte_data.count_subscribes, 0) AS count_subscribes,
    COALESCE(cte_data.count_unsubscribes, 0) AS count_unsubscribes,
    COALESCE(cte_data.count_rebillings, 0) AS count_rebillings,
    COALESCE(cte_data.count_redirects_to_lp, 0) AS count_redirects_to_lp,
    COALESCE(cte_data.conversion_percent, 0) AS conversion_percent,
    COALESCE(cte_data.earnings, 0) AS earnings
FROM
    cte_range
LEFT JOIN
    cte_data ON cte_data.dt = cte_range.dt
ORDER BY
    cte_range.dt DESC
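For anyone calling this from Laravel: instead of interpolating {$date} and {$website->id} into the string, the two values can be passed as bindings. A minimal sketch, assuming $sql holds the query above with those two interpolated spots replaced by ? placeholders:

// \DB::select() runs a raw query with bindings and returns an array of
// stdClass rows, one per day generated by cte_range.
$statistics = collect(\DB::select($sql, [
    "{$date} 00:00:00", // upper bound for generate_series in cte_range
    $website->id,       // website_id filter in cte_data
]));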
Upvotes: 4
Reputation: 3288
Big data is never easy, but I would suggest using Laravel's chunk instead of get.
https://laravel.com/docs/5.1/eloquent (ctrl+f "::chunk")
What ::chunk does is select n rows at a time and let you process them bit by bit. This is convenient in that it allows you to stream updates to the browser, but at the ~150k-result range I would suggest looking up how to push this work into a background process instead of handling it on request.
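For illustration, a chunked version of the question's aggregation could look roughly like this ($website and the dt format are assumed from the question; in production this loop belongs in a queued job rather than in the request):

$perDay = [];

\DB::table('transactions')
    ->where('website_id', $website->id)
    ->orderBy('dt', 'desc')
    ->chunk(1000, function ($rows) use (&$perDay) {
        // Each $rows is a Collection of at most 1,000 stdClass rows;
        // only one page of results is in memory at a time.
        foreach ($rows as $row) {
            $day = substr($row->dt, 0, 10);
            $perDay[$day][$row->status] = ($perDay[$day][$row->status] ?? 0) + 1;
        }
    });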
Upvotes: 4