untitled new
untitled new

Reputation: 13

MYSQL SUM until last day of Each month for last 12 months

I have a table like this two

Table A

  date          amount     B_id 
'2020-1-01'     3000000      1
'2019-8-01'     15012        1
'2019-6-21'     90909        1
'2020-1-15'     84562        1
--------

Table B

id       type
1         7
2         5

I have to show sum of amount until the last date of each month for the last 12 month. The query i have prepared is like this..

SELECT num2.last_dates,
  (SELECT SUM(amount) FROM A
      INNER JOIN B ON A.B_id = B.id
      WHERE B.type = 7 AND A.date<=num2.last_dates
     ),
    (SELECT SUM(amount) FROM A
      INNER JOIN B ON A.B_id = B.id
      WHERE B.type = 5 AND A.date<=num2.last_dates)
FROM
  (SELECT last_dates
      FROM (
  SELECT LAST_DAY(CURDATE()  - INTERVAL CUSTOM_MONTH MONTH) last_dates
      FROM(
          SELECT 1 CUSTOM_MONTH UNION
            SELECT 0 UNION
            SELECT 2 UNION
            SELECT 3 UNION
            SELECT 4 UNION
            SELECT 5 UNION
            SELECT 6 UNION
            SELECT 7 UNION
            SELECT 8 UNION
            SELECT 9 UNION
            SELECT 10 UNION
          SELECT 11 UNION
            SELECT 12 )num
) num1
  )num2

ORDER BY num2.last_dates

This gives me the result like this which is exactly how i need it. I need this query to execute faster. Is there any better way to do what i am trying to do?

2019-05-31  33488.69        109.127800
2019-06-30  263.690          1248932.227800
2019-07-31  274.690         131.827800
2019-08-31  627.690         13.687800
2019-09-30  1533.370000     08.347800
2019-10-31  1444.370000     01.327800
2019-11-30  5448.370000     247.227800
2019-12-31  61971.370000    016.990450
2020-01-31  19550.370000    2535.185450
2020-02-29  986.370000      405.123300
2020-03-31  1152.370000     26.793300
2020-04-30  9404.370000     11894.683300
2020-05-31  3404.370000     17894.683300



Upvotes: 0

Views: 898

Answers (3)

Geoduck
Geoduck

Reputation: 9005

You are getting some complicated answers. I think it is easier. Start with knowing we can easily sum for each month:

SELECT SUM(amount) as monthtotal,
        type,
        MONTH(date) as month,
        YEAR(date) as year 
FROM A LEFT JOIN B on A.B_id=B.id 
GROUP BY type,month,year

From that data, we can use a variable to get running total. Best to do by initializing the variable, but not necessary. We can get the data necessary like this

SET @running := 0;

SELECT (@running := @running + monthtotal) as running, type, LAST_DAY(CONCAT(year,'-',month,'-',1))
FROM 
(SELECT SUM(amount) as monthtotal,type,MONTH(date) as month,YEAR(date) as year FROM A LEFT JOIN B on A.B_id=B.id GROUP BY type,month,year) AS totals
ORDER BY year,month

You really need to have a connector that supports multiple statements, or make multiple calls to initialize the variable. Although you can null check the variable and default to 0, you still have an issue if you run the query a second time.

Last thing, if you really want the types to be summed separately:

SET @running5 := 0;
SET @running7 := 0;

SELECT 
    LAST_DAY(CONCAT(year,'-',month,'-',1)),
    (@running5 := @running5 + (CASE WHEN type=5 THEN monthtotal ELSE 0 END)) as running5, 
    (@running7 := @running7 + (CASE WHEN type=7 THEN monthtotal ELSE 0 END)) as running7
FROM 
(SELECT SUM(amount) as monthtotal,type,MONTH(date) as month,YEAR(date) as year FROM A LEFT JOIN B on A.B_id=B.id GROUP BY type,month,year) AS totals
ORDER BY year,month

We still don't show months where there is no data. I'm not sure that is a requirement. But this should only need one pass of table A.

Also, make sure the id on table B is indexed.

Upvotes: 0

spencer7593
spencer7593

Reputation: 108400

I'd use conditional aggregation, and pre-aggregate the monthly totals in one pass, instead of doing twenty-six individual passes repeatedly through the same data.

I'd start with something like this:

SELECT CASE WHEN A.date < DATE(NOW()) + INTERVAL -14 MONTH
       THEN LAST_DAY(     DATE(NOW()) + INTERVAL -14 MONTH )
       ELSE LAST_DAY( A.date )
       END                                    AS _month_end
     , SUM(IF( B.type = 5 , B.amount , NULL)) AS tot_type_5
     , SUM(IF( B.type = 7 , B.amount , NULL)) AS tot_type_7
  FROM A
  JOIN B
    ON B.id = A.B_id
 WHERE B.type IN (5,7)
 GROUP
    BY _month_end

(column amount isn't qualified in original query, so just guessing here which table that is from. adjust as necessary. best practice is to qualify all column references.

That gets us the subtotals for each month, in a single pass through A and B.

We can get that query tested and tuned.

Then we can incorporate that as an inline view in an outer query which adds up those monthly totals. (I'd do an outer join, just in case rows are missing, sow we don't wind up omitting rows.)

Something like this:

SELECT d.dt + INTERVAL -i.n MONTH + INTERVAL -1 DAY  AS last_date
     , SUM(IFNULL(t.tot_type_5,0))                   AS rt_type_5
     , SUM(IFNULL(t.tot_type_7,0))                   AS rt_type_7
  FROM ( -- first day of next month
         SELECT DATE(NOW()) + INTERVAL -DAY(DATE(NOW()))+1 DAY + INTERVAL 1 MONTH AS dt
       ) d
 CROSS
  JOIN ( -- thirteen integers, integers 0 thru 12
         SELECT 0 AS n
         UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
         UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8
         UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12
       ) i

 LEFT
 JOIN ( -- totals by month
        SELECT CASE WHEN A.date < DATE(NOW()) + INTERVAL -14 MONTH
               THEN LAST_DAY(     DATE(NOW()) + INTERVAL -14 MONTH )
               ELSE LAST_DAY( A.date )
               END                                    AS _month_end
             , SUM(IF( B.type = 5 , B.amount , NULL)) AS tot_type_5
             , SUM(IF( B.type = 7 , B.amount , NULL)) AS tot_type_7
          FROM A
          JOIN B
            ON B.id = A.B_id
         WHERE B.type IN (5,7)
         GROUP
            BY _month_end
      ) t
   ON t._month_end < d.dt

GROUP BY d.dt + INTERVAL -i.n MONTH + INTERVAL -1 DAY
ORDER BY d.dt + INTERVAL -i.n MONTH + INTERVAL -1 DAY DESC

The design is meant to do one swoop through the A JOIN B set. We're expecting to get about 14 rows back. And we're doing a semi-join, duplicating the oldest months multiple times, so approx . 14 x 13 / 2 = 91 rows, that get collapsed into 13 rows.

The big rock in terms of performance is going to be materializing that inline view query.

Upvotes: 2

Thorsten Kettner
Thorsten Kettner

Reputation: 94914

This is how I'd probably approach this in MySQL 8 with SUM OVER:

  1. Get the last 12 months.
  2. Use these months to add empty month rows to the original data, as MySQL doesn't support full outer joins.
  3. Get the running totals for all months.
  4. Show only the last twelve months.

The query:

with months (date) as
(
  select last_day(current_date - interval  1 month) union all
  select last_day(current_date - interval  2 month) union all
  select last_day(current_date - interval  3 month) union all
  select last_day(current_date - interval  4 month) union all
  select last_day(current_date - interval  5 month) union all
  select last_day(current_date - interval  6 month) union all
  select last_day(current_date - interval  7 month) union all
  select last_day(current_date - interval  8 month) union all
  select last_day(current_date - interval  9 month) union all
  select last_day(current_date - interval 10 month) union all
  select last_day(current_date - interval 11 month) union all
  select last_day(current_date - interval 12 month)
)
, data (date, amount, type) as
(
  select last_day(a.date), a.amount, b.type
  from a
  join b on b.id = a.b_id
  where b.type in (5, 7)
  union all
  select date, null, null from months
)
select
  date,
  sum(sum(case when type = 5 then amount end)) over (order by date) as t5,
  sum(sum(case when type = 7 then amount end)) over (order by date) as t7
from data
group by date
order by date
limit 12;

Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ddeb3ab3e086bfc182f0503615fba74b

I don't know whether this is faster than your own query or not. Just give it a try. (You'd get my query much faster by adding a generated column for last_day(date) to your table and use this. If you need this often, this may be an option.)

Upvotes: 0

Related Questions