rfho_bdss
rfho_bdss

Reputation: 170

SQL count users with winning streaks

I am fairly new to SQL and have been messing about with some small projects to get to grips with how it can be used for large scale statistics.

The problem I am currently working on is just counting the number of users who have had a winning streak in a time period.

Given a table of the format

user_id,date_time,user_team,win_team
ab542a,2018-01-02 18:45:25,team1,team1
ef72da,2018-01-02 08:20:01,team2,team1
f5c776,2017-12-30 15:25:25,team1,team2
5a278a,2018-01-01 14:27:15,team2,team2
ae346d,2018-01-01 14:27:15,team2,team2
2b13d8,2017-12-31 12:33:34,team1,team2
ace797,2018-01-02 08:20:01,team2,team2
ace797,2018-01-03 18:18:22,team1,team2
ab542a,2018-01-03 18:45:25,team1,team1
ef72da,2018-01-03 08:20:01,team2,team1
f5c776,2017-12-31 15:25:25,team1,team2
5a278a,2018-01-02 14:27:15,team2,team2
ae346d,2018-01-02 14:27:15,team2,team2
2b13d8,2018-01-01 12:33:34,team1,team2
ace797,2018-01-03 08:20:01,team1,team1
ace797,2018-01-04 18:18:22,team1,team1
ab542a,2018-01-04 18:45:25,team1,team1
ef72da,2018-01-04 08:20:01,team2,team1
f5c776,2018-01-01 15:25:25,team1,team2
5a278a,2018-01-03 14:27:15,team2,team2
ae346d,2018-01-03 14:27:15,team2,team2
2b13d8,2018-01-02 12:33:34,team1,team2
ace797,2018-01-04 08:20:01,team2,team2
ace797,2018-01-05 18:18:22,team1,team1

where user is the users id, date is the date of the match, team is the side the user played for and winner is the winning side from the match. How could I count all the users that have had a winning streak (at least 3 consecutive wins)?

Furthermore, say I want to also keep track of the game being played (chess, backgammon, etc.) in the same table, would it be possible in the same query to track streaks in multiple games?

In python this could be achieved with a relatively simple loop over user ids but would be computationally expensive and probably wouldn't scale well

Upvotes: 0

Views: 709

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270391

If you want users with at least 3 wins in a row, you can use window functions like this:

select count(distinct t.user)
from (select t.*,
             lead(date, 2) over (partition by user order by date) as date_2,
             lead(date, 2) over (partition by user, (case when team = winner then 'win' else 'lose' end
                                 order by date
                                ) date_same_2
      from t
      where date >= ? and date < ?
     ) t
where team = winner;

What this is doing is checking the row for the user 2 rows ahead by two criteria. The first is just by date. The second is when the user's team is the winner. If these are the same -- and the current row is a winning row -- then you have three wins in a row.

Upvotes: 1

Related Questions