PatrickPirker

Reputation: 359

MySQL: calculating the longest streak per user over the last 60 days

I want to calculate the longest "streak" of every user within the last 60 days from this MySQL table. A streak is a run of consecutive days on which there is an entry for the user.

+-----+------------+---------------------+
| id  | user       | date                |
+-----+------------+---------------------+
|   3 | test1      | 2014-06-10 23:55:01 |
|   4 | test2      | 2014-06-10 02:01:06 |
|   5 | test1      | 2014-06-11 23:55:06 |
|   6 | test2      | 2014-06-11 23:55:07 |
|   7 | test1      | 2014-06-12 23:55:07 |
|   9 | test1      | 2014-06-13 23:55:07 |
|  10 | test2      | 2014-06-13 23:55:07 |
+-----+------------+---------------------+

The output should look like this:

test1  4
test2  2  (no entry on 2014-06-12)

But I don't know how to do this correctly.

Upvotes: 2

Views: 515

Answers (1)

spencer7593

Reputation: 108450

One way to do this is to use MySQL user variables. This isn't necessarily the most efficient approach for large sets, since it materializes two inline views.

SELECT s.user
     , MAX(s.streak) AS longest_streak
  FROM ( SELECT IF(@prev_user = o.user AND o.date = @prev_date + INTERVAL 1 DAY
                  , @streak := @streak + 1
                  , @streak := 1
                ) AS streak
              , @prev_user := o.user AS user
              , @prev_date := o.date AS `date`
           FROM ( SELECT t.user
                       , DATE(t.date) AS `date`
                    FROM mytable t
                   CROSS
                    JOIN (SELECT @prev_user := NULL, @prev_date := NULL, @streak := 1) i
                   WHERE t.date >= DATE(NOW()) + INTERVAL -60 DAY
                   GROUP BY t.user, DATE(t.date)
                   ORDER BY t.user, DATE(t.date)
                ) o
       ) s
 GROUP BY s.user

The inline view aliased as i only initializes the user variables. We don't care what it returns, except that it must return exactly 1 row because of the JOIN operation; what matters is the side effect of initializing the user variables early in the statement execution.

The inline view aliased as o gets a list of users and dates. The specification only asks whether there is an entry for the user on a given day, so we can truncate the time portion with DATE() and use the GROUP BY clause to reduce the rows to a distinct set of (user, date) pairs.

The inline view aliased as s processes each row, and saves the values of the current row into the @prev_ user variables. Before it overwrites the values, it compares the values on the current row to the values (saved) from the previous row. If the user matches, and the date on the current row is exactly 1 day later than the previous date, we are continuing a "streak", so we increment the current value of the @streak variable by 1. Otherwise, the previous streak was broken, and we start a new "streak", resetting @streak to 1.

Finally, we process the rows from s to extract the maximum streak for each user.
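
To make the row-by-row logic easier to follow, here is a minimal Python sketch of what the views o and s and the final GROUP BY are meant to compute. It is only an illustration, not part of the MySQL statement: the function name longest_streaks and the in-memory rows list are hypothetical, the dates are assumed to be already truncated to days, and the 60-day filter is left out for brevity.

```python
from datetime import date, timedelta

def longest_streaks(rows):
    """rows: iterable of (user, day) pairs; returns {user: longest streak}."""
    # Like view o: distinct (user, day) pairs, ordered by user, then day.
    ordered = sorted(set(rows))
    prev_user = prev_day = None
    streak = 0
    longest = {}
    for user, day in ordered:
        # Like view s: continue the streak only if the same user appears
        # exactly one day after the previous row; otherwise start over at 1.
        if user == prev_user and day == prev_day + timedelta(days=1):
            streak += 1
        else:
            streak = 1
        prev_user, prev_day = user, day
        # Like the outer query: keep the maximum streak seen per user.
        longest[user] = max(longest.get(user, 0), streak)
    return longest

# Sample data from the question, with the time portion already dropped.
rows = [
    ("test1", date(2014, 6, 10)),
    ("test2", date(2014, 6, 10)),
    ("test1", date(2014, 6, 11)),
    ("test2", date(2014, 6, 11)),
    ("test1", date(2014, 6, 12)),
    ("test1", date(2014, 6, 13)),
    ("test2", date(2014, 6, 13)),
]
print(longest_streaks(rows))  # {'test1': 4, 'test2': 2}
```

The sort plays the role of the ORDER BY in o; without a guaranteed order, the comparison against the previous row would be meaningless, which is also why the SQL version depends on the rows from o being processed in that order.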

(This statement is desk-checked only; there could be a typo or two.)

Upvotes: 4
