maxTrialfire
maxTrialfire

Reputation: 551

Finding the rows leading up to an event and treating them as a sequence

This feels like a gaps and islands problem and window functions seem like the right tool but I can't seem to get a working result.

My data looks like this (Note: GameID is a UUID and shown here as a counter for readability):

  GameID   User   Date      Win  

  100       A    10/11/2012   0
  101       A    10/12/2012   0
  102       B    10/11/2012   0
  103       B    10/13/2012   1
  104       B    10/14/2012   0
  105       C    10/10/2012   0
  106       C    10/12/2012   0
  107       C    10/13/2012   1
  108       C    10/14/2012   0
  109       C    10/15/2012   0
  110       C    10/16/2012   0
  111       C    10/17/2012   1
  112       D    10/11/2012   0
  113       D    10/13/2012   1
  114       D    10/20/2012   0
  115       D    10/21/2012   0

I'm looking to capture (count/aggregate into array) the sequence of losses a user has before each win. So for example if you look at user B they have one win, that win has one loss before it. If you look at user C, the win with ID 107 is preceded by two losses and the win with ID 111 is preceded by 3 loses.

I'm looking to apply array aggregation array_agg and attach the preceding loses to a win. Ultimately I want the following result:

  GameID   User   Date      Win  LosingStreak


  103       B    10/13/2012   1    [102]
  107       C    10/13/2012   1    [105, 106]
  111       C    10/17/2012   1    [110,109,108]
  113       D    10/13/2012   1    [112]

I've been playing with partition by User order by date but I need to "reset" each partition at a win and I can't seem to wrap my head around any way of solving it. lead() and lag() also are of no help because I need the lead(x) where x is a variable number for each partition.

Upvotes: 0

Views: 55

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269823

Hmmm. You can identify the groups using a reverse sum of the wins. Then, just do the aggregation:

select t.user, max(t.date) as date,
       max(case when win = 1 then gameid end) as gameid,
       array_agg(gameid order by date asc) filter (where win = 0) as gameid_losses
from (select t.*,
             sum(wins) over (partition by user order by date desc) as grp
      from t
     ) t
group by user, grp;

Upvotes: 1

Related Questions