sil
sil

Reputation: 2110

Cumulative count(*) in MySQL

Imagine a SQL table votes, with two columns vote and time; vote is a string, time is a timestamp. I'd like to retrieve a "cumulative count" from the table. So, the table looks like this:

vote       | time
-----------+----------------
stuart     | 1443000000
stuart     | 1443000000
bryan      | 1443000000
stuart     | 1443000001
bryan      | 1443000001
bryan      | 1443000002
bryan      | 1443000002

so, "stuart" gets two votes in the first time interval and "bryan" gets one; they get one each in the second interval, and bryan gets two more in the third, and thus the output results I would like are these:

vote       | time         | cumulative total
-----------+--------------+-----------------
stuart     | 1443000000   | 2
bryan      | 1443000000   | 1
stuart     | 1443000001   | 3
bryan      | 1443000001   | 2
stuart     | 1443000002   | 3
bryan      | 1443000002   | 4

I'm not sure how to do this. Obviously, getting how many votes happened in each time interval is trivial (SELECT vote, time, COUNT(*) from table GROUP BY time, vote ORDER BY time, vote) but making a cumulative count seems harder. Is there a way to do so?

(I'm using MySQL, and am happy to use MySQL-specific functions here if useful.)

Upvotes: 5

Views: 4954

Answers (3)

Lukas Eder
Lukas Eder

Reputation: 220797

MySQL 8 introduced support for window functions, which are the perfect tool for cumulative sums. They will perform much better than the alternatives using correlated subqueries. Your desired query can be written as follows:

SELECT
  vote,
  time,
  sum(count(*)) OVER (PARTITION BY vote ORDER BY time)
FROM table
GROUP BY time, vote
ORDER BY time, vote

Note that the inner count(*) is an ordinary aggregate function, which is nested in the outer sum(...) OVER (...) window function. This is possible because aggregate functions are logically calculated before window functions.

Upvotes: 3

Jerry Jeremiah
Jerry Jeremiah

Reputation: 9618

Ok, if you really need that stuart | 1443000002 | 3 row then you can do this:

select distinct vote, time,
       (select count(*) from `table` t2
        where t2.vote = t1.vote
        and t2.time <= t1.time) `cumulative total`
from   (select distinct t3.vote vote, t4.time time
        from `table` t3, `table` t4) t1;

Here is the SQL Fiddle: http://sqlfiddle.com/#!9/960d6/11/0

What we need to do is make a table that contains EVERY combination of names and times (a join with no conditions) and then calculate the counts from the original table using all those combinations.

Thanks to @jarlh for the query in his answer - not having to start from scratch saved a bunch of time.

Upvotes: 0

jarlh
jarlh

Reputation: 44766

Using a correlated sub-query to do the counting:

select distinct vote, time, (select count(*) from table t2
                             where t2.vote = t1.vote
                               and t2.time <= t1.time)
from table t1

Upvotes: 6

Related Questions