Centell
Centell

Reputation: 409

Cumulative Sum in MySQL

Using MySQL. I want to get cumulative sum.

This is my table

CREATE TABLE `user_infos`
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
(..)
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`) )

And what I want to get is

+-------+-------+----------------+
| month | count | cumulative_sum |
+-------+-------+----------------+
|   01  |  100  | 100            |
|   02  |  101  | 201            |
|   ... |  110  | 311            |
|   12  |  200  | 511            |
+-------+-------+----------------+

but the result is

+-------+-------+----------------+
| month | count | cumulative_sum |
+-------+-------+----------------+
|   01  |  100  | 100            |
|   02  |  101  | 101            |
|   ... |  110  | 110            |
|   12  |  200  | 200            |
+-------+-------+----------------+

This is my wrong query..

select 
     T1.Month,T1.Count,
     @runnung_total := (@running_total + T1.Count) as cumulative_sum
from (
     select date_format(created_at,'%m') as Month,count(1) as Count from users 
          where date_format(created_at,'%Y')='2016'
          group by(date_format(created_at,'%m'))
     union
     select date_format(created_at,'%m') as Month,count(1) as Count from users 
          where date_format(created_at,'%Y')='2017'
          group by(date_format(created_at,'%m')) ) as T1
join (select @running_total := 0) as R1;

I referred to this. What's wrong in my code?

Upvotes: 1

Views: 2884

Answers (3)

Lukas Eder
Lukas Eder

Reputation: 221350

Starting with MySQL 8, the ideal approach to calculate cumulative sums is by using SQL standard window functions rather than the vendor-specific, and not stricly declarative approach of using local variables. Your query can be written as follows:

WITH data(month, count) AS (
  SELECT date_format(create_at, '%m') AS month, count(*) AS count
  FROM users
  GROUP BY date_format(create_at, '%m')
)
SELECT
  month,
  count,
  sum(count) OVER (ORDER BY month) AS cumulative_sum
FROM data

Upvotes: 0

Stefano Zanini
Stefano Zanini

Reputation: 5926

You can achieve that in two steps: first of all get the sum for each year and month

select  concat(year(created_at), lpad(month(created_at), 2, '0')) as ye_mo,
        count(*) as cnt
from    users
group by concat(year(created_at), lpad(month(created_at), 2, '0'))

Then join it with itself, having each row matched with all previous ones

select  t1.ye_mo, sum(t2.cnt)
from    (
            select  concat(year(created_at), lpad(month(created_at), 2, '0')) as ye_mo,
                    count(*) as cnt
            from    users
            group by concat(year(created_at), lpad(month(created_at), 2, '0'))
        ) t1
join    (
            select  concat(year(created_at), lpad(month(created_at), 2, '0')) as ye_mo,
                    count(*) as cnt
            from    users
            group by concat(year(created_at), lpad(month(created_at), 2, '0'))
        ) t2
on      t1.ye_mo >= t2.ye_mo
group by t1.ye_mo
order by t1.ye_mo

Edit

The query above assumes you want the running sum to increase across different years. If you want to display the months only, and aggregate the values of different years in the same month, you can change id this way

select  t1.mnt, sum(t2.cnt)
from    (
            select  month(created_at) as mnt,
                    count(*) as cnt
            from    userss
            group by month(created_at)
        ) t1
join    (
            select  month(created_at) as mnt,
                    count(*) as cnt
            from    userss
            group by month(created_at)
        ) t2
on      t1.mnt >= t2.mnt
group by t1.mnt
order by t1.mnt

Finally, if you want the running sum to reset at the beginning of each year, you can do that like this

select  t1.yr, t1.mn, sum(t2.cnt)
from    (
            select  year(created_at) as yr, month(created_at) as mn,
                    count(*) as cnt
            from    userss
            group by year(created_at), month(created_at)
        ) t1
join    (
            select  year(created_at) as yr, month(created_at) as mn,
                    count(*) as cnt
            from    userss
            group by year(created_at), month(created_at)
        ) t2
on      t1.yr = t2.yr and
        t1.mn >= t2.mn
group by t1.yr, t1.mn
order by t1.yr, t1.mn

All three versions can be seen in action here

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1271111

Variables are the right way to go. You can simplify your query:

select m.Month, m.cnt,
     (@running_total := (@running_total + m.cnt) ) as cumulative_sum
from (select month(created_at) as Month, count(*) as cnt
      from users 
      where year(created_at) in (2016, 2017)
      group by month(created_at)
     ) m cross join
     (select @running_total := 0) params
order by m.Month;

Upvotes: 2

Related Questions