nikhil shrma
nikhil shrma

Reputation: 119

How to calculate the longest winning streak with daterange

I have a table in sql server which consists of data like

userID  amount  startdate              enddate        game    Result
-------------------------------------------------------------------------------
68838   51.00   2014-05-29 15:41:41.167 2014-05-29 15:41:41.167 1   W
68838   51.00   2014-05-29 15:42:30.757 2014-05-29 15:42:30.757 1   W
68838   -0.50   2014-05-31 16:57:31.033 2014-05-31 16:57:31.033 1   L
68838   -0.50   2014-05-31 17:05:31.023 2014-05-31 17:05:31.023 1   L
68838   -0.50   2014-05-31 17:22:03.857 2014-05-31 17:22:03.857 1   L
68838   0.42    2014-05-31 17:26:32.570 2014-05-31 17:26:32.570 1   W
68838   0.42    2014-05-31 17:34:45.330 2014-05-31 17:34:45.330 1   W
68838   0.42    2014-05-31 17:38:44.107 2014-05-31 17:38:44.107 1   W
68838   0.42    2014-05-31 17:42:12.790 2014-05-31 17:42:12.790 1   W
434278  0.42    2014-05-31 16:57:31.033 2014-05-31 16:57:31.033 1   W
434278  0.42    2014-05-31 17:05:31.023 2014-05-31 17:05:31.023 1   W
434278  0.42    2014-05-31 17:22:03.857 2014-05-31 17:22:03.857 1   W
434278  -0.50   2014-05-31 17:26:32.570 2014-05-31 17:26:32.570 1   L
434278  -0.50   2014-05-31 17:34:45.330 2014-05-31 17:34:45.330 1   L
434278  -0.50   2014-05-31 17:38:44.107 2014-05-31 17:38:44.107 1   L
434278  -0.50   2014-05-31 17:42:12.790 2014-05-31 17:42:12.790 1   L
434278  0.42    2014-05-31 17:46:40.723 2014-05-31 17:46:40.723 1   W
434278  -0.50   2014-05-31 17:51:26.190 2014-05-31 17:51:26.190 1   L
434278  0.42    2014-05-31 17:55:32.870 2014-05-31 17:55:32.870 1   W
434278  -4.00   2014-05-31 18:06:54.937 2014-05-31 18:06:54.937 1   L
434278  -2.00   2014-05-31 18:19:29.483 2014-05-31 18:19:29.483 1   L

I want the result to look like this, showing the longest winning streak for each users

UserId StartDate                  Enddate                    Streak  amount
--------------------------------------------------------------------
68838  2014-05-31 17:26:32:570    2014-05-31 17:42:12:570     4       1.68
434278  2014-05-31 16:57:31:033   2014-05-31 17:22:03:857     3       1.26

Upvotes: 2

Views: 1204

Answers (3)

stormCloud
stormCloud

Reputation: 993

Disclaimer: Glenn's answer is a great answer, and does a lot of the heavy lifting for you, but it doesn't go all the way to doing what you asked for. I was going to post an answer I'd been working on, but when I came to add it, saw that Glenn's way of doing the heavy lifting was better than my way of doing it, so I reworked my answer to include his way of doing it. I would urge you to accept his answer, rather than mine.

The below should do exactly what you asked for.

SELECT
    Userid,
    Min_StartDate as StartDate,
    Max_EndDate as EndDate,
    max_group_count as Streak,
    sum_Amount as Amount

FROM (
    SELECT
        *,
        -- we want the earliest maximum streak
        max(Min_StartDate) OVER (PARTITION BY userid) as Earliest_StartDate

    FROM (
        SELECT
            *,
            -- we want the maximum streak
            max(max_group_count) OVER (PARTITION BY userid) as MAX_Streak
        FROM (    
            SELECT DISTINCT
                  userid,
                  -- Calculate this streak
                  COUNT(grouping) OVER (PARTITION BY userid, grouping
                             ORDER BY startdate
                             ROWS BETWEEN UNBOUNDED PRECEDING
                                  AND UNBOUNDED FOLLOWING ) as max_group_count
                  -- Calcualte the start date of this streak
                  ,MIN(StartDate) OVER (PARTITION BY userid, grouping
                             ORDER BY startdate
                             ROWS BETWEEN UNBOUNDED PRECEDING
                                  AND UNBOUNDED FOLLOWING ) as Min_StartDate
                  -- Calcualte the end date of this streak
                  ,MAX(EndDate) OVER (PARTITION BY userid, grouping
                             ORDER BY startdate
                             ROWS BETWEEN UNBOUNDED PRECEDING
                                  AND UNBOUNDED FOLLOWING ) as Max_EndDate
                  -- Calcualte the total amount
                  ,SUM(Amount) OVER (PARTITION BY userid, grouping
                             ORDER BY startdate
                             ROWS BETWEEN UNBOUNDED PRECEDING
                                  AND UNBOUNDED FOLLOWING ) as Sum_Amount         

              FROM ( SELECT *
                        -- Assign a group number to the streak, so we can group by it
                       ,SUM(CASE WHEN result <> prev_result THEN 1 ELSE 0 END) OVER
                       (PARTITION BY userid ORDER BY startdate) AS grouping

                   FROM ( SELECT *
                         -- We want to look at the previous record to determin when the 
                         -- winning/loosing streak starts and ends
                        ,COALESCE(LAG(result) OVER
                            (PARTITION BY userid ORDER BY startdate), result) AS prev_result
                        FROM game
                    ) a

                   WHERE result = 'W'

                   ) b
             ) c
      ) d 
  WHERE
      Max_Group_Count = Max_Streak

) e
WHERE
  Min_StartDate = Earliest_StartDate

The output of this is:

| USERID |                  STARTDATE |                    ENDDATE | STREAK | AMOUNT |
|--------|----------------------------|----------------------------|--------|--------|
|  68838 | May, 31 2014 17:26:32+0000 | May, 31 2014 17:42:12+0000 |      4 |   1.68 |
| 434278 | May, 31 2014 16:57:31+0000 | May, 31 2014 17:22:03+0000 |      3 |   1.26 |

I've included this as an sql fiddle that you can play with if you fancy: http://sqlfiddle.com/#!6/32777/36/0

Upvotes: 2

Serpiton
Serpiton

Reputation: 3684

In SQLServer 2008 the SUM has not the possibility to use the OVER(ORDER BY ...), that make the query little more complex, but not impossible

;WITH myID AS (
  SELECT userID, amount, startdate, enddate, game, Result
       , ID = Row_Number() OVER (Partition By userID ORDER BY startdate)
  FROM   Table1 t1
), SR AS (
  SELECT t1.userID, t1.startdate, t1.enddate, t1.Result, t1.amount
       , SUM(CASE WHEN t1.Result <> COALESCE(t2.Result, t1.Result) 
                  THEN 1 
                  ELSE 0 END) SC
  FROM   myID t1
         LEFT JOIN myID t2 ON t1.userID = t2.userID AND t1.ID >= t2.ID
  GROUP BY t1.userID, t1.startdate, t1.enddate, t1.Result, t1.amount
), SL AS (
  SELECT userID, Result, SC, Count(1) Streak
       , Row_Number() OVER (PARTITION BY userID ORDER BY Count(1) DESC) Pos
  FROM   SR
  WHERE  Result = 'W'
  GROUP BY userID, Result, SC
)
SELECT p.userID
     , MIN(p.startdate) startdate
     , MAX(p.enddate) enddate
     , l.Streak
     , SUM(p.Amount) Amount
FROM   SR p
       INNER JOIN SL l ON p.userID = l.userID AND p.SC = l.SC
WHERE  l.Pos = 1
GROUP BY p.userID, l.Streak

SQLFiddle demo

The myID CTE generate an integer ID for the data to simplify the JOIN condition of the next CTE, if the data showed in the question is not the entire table and there is already a column with the same effect this CTE should be removed.
The SR (StreakRank) CTE add to the data the Streak Counter SC, using a triangular JOIN to generate a rank for every streak, it's not a dense rank, it's just something to use to group by.
The SL (StreakLength) CTE get the streak length for every winning streak and create a position ranking by length.
The main query put all together: get the longest streak per usedID from SL and JOIN it to SR to get the details.

Upvotes: 0

Glenn
Glenn

Reputation: 9160

Here are some ideas to get you started. You can play with the inner queries and start working your way out. Basically, it first adds an extra column to each row showing the previous result. Then, every time the previous result differs from the current result, consider it a group switch. Each grouping gets its own number (0 based, and within the context of a user id). Throw away the 'L' groupings. Now you are interested in the grouping with the biggest count for each user.

SELECT *
      ,COUNT(grouping) OVER (PARTITION BY userid, grouping
                 ORDER BY startdate
                 ROWS BETWEEN UNBOUNDED PRECEDING
                      AND UNBOUNDED FOLLOWING ) as max_group_count

  FROM ( SELECT *
           ,SUM(CASE WHEN result <> prev_result THEN 1 ELSE 0 END) OVER
           (PARTITION BY userid ORDER BY startdate) AS grouping

       FROM ( SELECT *
            ,COALESCE(LAG(result) OVER
                (PARTITION BY userid ORDER BY startdate), result) AS prev_result
            FROM game
        ) x

       WHERE result = 'W'

       ) y

Result:

 userid |        startdate        |         enddate         | result | prev_result | grouping | max_group_count
--------+-------------------------+-------------------------+--------+-------------+----------+-----------------
  68838 | 2014-05-29 15:41:41.167 | 2014-05-29 15:41:41.167 | W      | W           |        0 |               2
  68838 | 2014-05-29 15:42:30.757 | 2014-05-29 15:42:30.757 | W      | W           |        0 |               2
  68838 | 2014-05-31 17:26:32.57  | 2014-05-31 17:26:32.57  | W      | L           |        1 |               4
  68838 | 2014-05-31 17:34:45.33  | 2014-05-31 17:34:45.33  | W      | W           |        1 |               4
  68838 | 2014-05-31 17:38:44.107 | 2014-05-31 17:38:44.107 | W      | W           |        1 |               4
  68838 | 2014-05-31 17:42:12.79  | 2014-05-31 17:42:12.79  | W      | W           |        1 |               4
 434278 | 2014-05-31 16:57:31.033 | 2014-05-31 16:57:31.033 | W      | W           |        0 |               3
 434278 | 2014-05-31 17:05:31.023 | 2014-05-31 17:05:31.023 | W      | W           |        0 |               3
 434278 | 2014-05-31 17:22:03.857 | 2014-05-31 17:22:03.857 | W      | W           |        0 |               3
 434278 | 2014-05-31 17:46:40.723 | 2014-05-31 17:46:40.723 | W      | L           |        1 |               1
 434278 | 2014-05-31 17:55:32.87  | 2014-05-31 17:55:32.87  | W      | L           |        2 |               1
(11 rows)

Upvotes: 0

Related Questions