Reputation: 119
I have a table in sql server which consists of data like
userID amount startdate enddate game Result
-------------------------------------------------------------------------------
68838 51.00 2014-05-29 15:41:41.167 2014-05-29 15:41:41.167 1 W
68838 51.00 2014-05-29 15:42:30.757 2014-05-29 15:42:30.757 1 W
68838 -0.50 2014-05-31 16:57:31.033 2014-05-31 16:57:31.033 1 L
68838 -0.50 2014-05-31 17:05:31.023 2014-05-31 17:05:31.023 1 L
68838 -0.50 2014-05-31 17:22:03.857 2014-05-31 17:22:03.857 1 L
68838 0.42 2014-05-31 17:26:32.570 2014-05-31 17:26:32.570 1 W
68838 0.42 2014-05-31 17:34:45.330 2014-05-31 17:34:45.330 1 W
68838 0.42 2014-05-31 17:38:44.107 2014-05-31 17:38:44.107 1 W
68838 0.42 2014-05-31 17:42:12.790 2014-05-31 17:42:12.790 1 W
434278 0.42 2014-05-31 16:57:31.033 2014-05-31 16:57:31.033 1 W
434278 0.42 2014-05-31 17:05:31.023 2014-05-31 17:05:31.023 1 W
434278 0.42 2014-05-31 17:22:03.857 2014-05-31 17:22:03.857 1 W
434278 -0.50 2014-05-31 17:26:32.570 2014-05-31 17:26:32.570 1 L
434278 -0.50 2014-05-31 17:34:45.330 2014-05-31 17:34:45.330 1 L
434278 -0.50 2014-05-31 17:38:44.107 2014-05-31 17:38:44.107 1 L
434278 -0.50 2014-05-31 17:42:12.790 2014-05-31 17:42:12.790 1 L
434278 0.42 2014-05-31 17:46:40.723 2014-05-31 17:46:40.723 1 W
434278 -0.50 2014-05-31 17:51:26.190 2014-05-31 17:51:26.190 1 L
434278 0.42 2014-05-31 17:55:32.870 2014-05-31 17:55:32.870 1 W
434278 -4.00 2014-05-31 18:06:54.937 2014-05-31 18:06:54.937 1 L
434278 -2.00 2014-05-31 18:19:29.483 2014-05-31 18:19:29.483 1 L
I want the result to look like this, showing the longest winning streak for each users
UserId StartDate Enddate Streak amount
--------------------------------------------------------------------
68838 2014-05-31 17:26:32:570 2014-05-31 17:42:12:570 4 1.68
434278 2014-05-31 16:57:31:033 2014-05-31 17:22:03:857 3 1.26
Upvotes: 2
Views: 1204
Reputation: 993
Disclaimer: Glenn's answer is a great answer, and does a lot of the heavy lifting for you, but it doesn't go all the way to doing what you asked for. I was going to post an answer I'd been working on, but when I came to add it, saw that Glenn's way of doing the heavy lifting was better than my way of doing it, so I reworked my answer to include his way of doing it. I would urge you to accept his answer, rather than mine.
The below should do exactly what you asked for.
SELECT
Userid,
Min_StartDate as StartDate,
Max_EndDate as EndDate,
max_group_count as Streak,
sum_Amount as Amount
FROM (
SELECT
*,
-- we want the earliest maximum streak
max(Min_StartDate) OVER (PARTITION BY userid) as Earliest_StartDate
FROM (
SELECT
*,
-- we want the maximum streak
max(max_group_count) OVER (PARTITION BY userid) as MAX_Streak
FROM (
SELECT DISTINCT
userid,
-- Calculate this streak
COUNT(grouping) OVER (PARTITION BY userid, grouping
ORDER BY startdate
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING ) as max_group_count
-- Calcualte the start date of this streak
,MIN(StartDate) OVER (PARTITION BY userid, grouping
ORDER BY startdate
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING ) as Min_StartDate
-- Calcualte the end date of this streak
,MAX(EndDate) OVER (PARTITION BY userid, grouping
ORDER BY startdate
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING ) as Max_EndDate
-- Calcualte the total amount
,SUM(Amount) OVER (PARTITION BY userid, grouping
ORDER BY startdate
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING ) as Sum_Amount
FROM ( SELECT *
-- Assign a group number to the streak, so we can group by it
,SUM(CASE WHEN result <> prev_result THEN 1 ELSE 0 END) OVER
(PARTITION BY userid ORDER BY startdate) AS grouping
FROM ( SELECT *
-- We want to look at the previous record to determin when the
-- winning/loosing streak starts and ends
,COALESCE(LAG(result) OVER
(PARTITION BY userid ORDER BY startdate), result) AS prev_result
FROM game
) a
WHERE result = 'W'
) b
) c
) d
WHERE
Max_Group_Count = Max_Streak
) e
WHERE
Min_StartDate = Earliest_StartDate
The output of this is:
| USERID | STARTDATE | ENDDATE | STREAK | AMOUNT |
|--------|----------------------------|----------------------------|--------|--------|
| 68838 | May, 31 2014 17:26:32+0000 | May, 31 2014 17:42:12+0000 | 4 | 1.68 |
| 434278 | May, 31 2014 16:57:31+0000 | May, 31 2014 17:22:03+0000 | 3 | 1.26 |
I've included this as an sql fiddle that you can play with if you fancy: http://sqlfiddle.com/#!6/32777/36/0
Upvotes: 2
Reputation: 3684
In SQLServer 2008 the SUM
has not the possibility to use the OVER(ORDER BY ...)
, that make the query little more complex, but not impossible
;WITH myID AS (
SELECT userID, amount, startdate, enddate, game, Result
, ID = Row_Number() OVER (Partition By userID ORDER BY startdate)
FROM Table1 t1
), SR AS (
SELECT t1.userID, t1.startdate, t1.enddate, t1.Result, t1.amount
, SUM(CASE WHEN t1.Result <> COALESCE(t2.Result, t1.Result)
THEN 1
ELSE 0 END) SC
FROM myID t1
LEFT JOIN myID t2 ON t1.userID = t2.userID AND t1.ID >= t2.ID
GROUP BY t1.userID, t1.startdate, t1.enddate, t1.Result, t1.amount
), SL AS (
SELECT userID, Result, SC, Count(1) Streak
, Row_Number() OVER (PARTITION BY userID ORDER BY Count(1) DESC) Pos
FROM SR
WHERE Result = 'W'
GROUP BY userID, Result, SC
)
SELECT p.userID
, MIN(p.startdate) startdate
, MAX(p.enddate) enddate
, l.Streak
, SUM(p.Amount) Amount
FROM SR p
INNER JOIN SL l ON p.userID = l.userID AND p.SC = l.SC
WHERE l.Pos = 1
GROUP BY p.userID, l.Streak
The myID
CTE
generate an integer ID for the data to simplify the JOIN
condition of the next CTE
, if the data showed in the question is not the entire table and there is already a column with the same effect this CTE
should be removed.
The SR
(StreakRank) CTE
add to the data the Streak Counter SC
, using a triangular JOIN
to generate a rank for every streak, it's not a dense rank, it's just something to use to group by.
The SL
(StreakLength) CTE
get the streak length for every winning streak and create a position ranking by length.
The main query put all together: get the longest streak per usedID
from SL
and JOIN
it to SR
to get the details.
Upvotes: 0
Reputation: 9160
Here are some ideas to get you started. You can play with the inner queries and start working your way out. Basically, it first adds an extra column to each row showing the previous result. Then, every time the previous result differs from the current result, consider it a group switch. Each grouping gets its own number (0 based, and within the context of a user id). Throw away the 'L' groupings. Now you are interested in the grouping with the biggest count for each user.
SELECT *
,COUNT(grouping) OVER (PARTITION BY userid, grouping
ORDER BY startdate
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING ) as max_group_count
FROM ( SELECT *
,SUM(CASE WHEN result <> prev_result THEN 1 ELSE 0 END) OVER
(PARTITION BY userid ORDER BY startdate) AS grouping
FROM ( SELECT *
,COALESCE(LAG(result) OVER
(PARTITION BY userid ORDER BY startdate), result) AS prev_result
FROM game
) x
WHERE result = 'W'
) y
Result:
userid | startdate | enddate | result | prev_result | grouping | max_group_count
--------+-------------------------+-------------------------+--------+-------------+----------+-----------------
68838 | 2014-05-29 15:41:41.167 | 2014-05-29 15:41:41.167 | W | W | 0 | 2
68838 | 2014-05-29 15:42:30.757 | 2014-05-29 15:42:30.757 | W | W | 0 | 2
68838 | 2014-05-31 17:26:32.57 | 2014-05-31 17:26:32.57 | W | L | 1 | 4
68838 | 2014-05-31 17:34:45.33 | 2014-05-31 17:34:45.33 | W | W | 1 | 4
68838 | 2014-05-31 17:38:44.107 | 2014-05-31 17:38:44.107 | W | W | 1 | 4
68838 | 2014-05-31 17:42:12.79 | 2014-05-31 17:42:12.79 | W | W | 1 | 4
434278 | 2014-05-31 16:57:31.033 | 2014-05-31 16:57:31.033 | W | W | 0 | 3
434278 | 2014-05-31 17:05:31.023 | 2014-05-31 17:05:31.023 | W | W | 0 | 3
434278 | 2014-05-31 17:22:03.857 | 2014-05-31 17:22:03.857 | W | W | 0 | 3
434278 | 2014-05-31 17:46:40.723 | 2014-05-31 17:46:40.723 | W | L | 1 | 1
434278 | 2014-05-31 17:55:32.87 | 2014-05-31 17:55:32.87 | W | L | 2 | 1
(11 rows)
Upvotes: 0