Alexander Farber
Alexander Farber

Reputation: 23058

Calculating average time interval length

I have prepared a simple SQL Fiddle demonstrating my problem -

In PostgreSQL 10.3 I store user information, two-player games and the moves in the following 3 tables:

CREATE TABLE players (
    uid SERIAL PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE games (
    gid SERIAL PRIMARY KEY,
    player1 integer NOT NULL REFERENCES players ON DELETE CASCADE,
    player2 integer NOT NULL REFERENCES players ON DELETE CASCADE
);

CREATE TABLE moves (
    mid BIGSERIAL PRIMARY KEY,
    uid integer NOT NULL REFERENCES players ON DELETE CASCADE,
    gid integer NOT NULL REFERENCES games ON DELETE CASCADE,
    played timestamptz NOT NULL
);

Let's assume that 2 players, Alice and Bob have played 3 games with each other:

INSERT INTO players (name) VALUES ('Alice'), ('Bob');
INSERT INTO games (player1, player2) VALUES (1, 2);
INSERT INTO games (player1, player2) VALUES (1, 2);
INSERT INTO games (player1, player2) VALUES (1, 2);

And let's assume that the 1st game was played quickly, with moves being played every minute.

But then they chilled :-) and played 2 slow games, with moves every 10 minutes:

INSERT INTO moves (uid, gid, played) VALUES
(1, 1, now() + interval '1 min'),
(2, 1, now() + interval '2 min'),
(1, 1, now() + interval '3 min'),
(2, 1, now() + interval '4 min'),
(1, 1, now() + interval '5 min'),
(2, 1, now() + interval '6 min'),

(1, 2, now() + interval '10 min'),
(2, 2, now() + interval '20 min'),
(1, 2, now() + interval '30 min'),
(2, 2, now() + interval '40 min'),
(1, 2, now() + interval '50 min'),
(2, 2, now() + interval '60 min'),

(1, 3, now() + interval '110 min'),
(2, 3, now() + interval '120 min'),
(1, 3, now() + interval '130 min'),
(2, 3, now() + interval '140 min'),
(1, 3, now() + interval '150 min'),
(2, 3, now() + interval '160 min');

At a web page with gaming statistics I would like to display average time passing between moves for each player.

So I suppose I have to use the LAG window function of PostgreSQL.

Since several games can be played simultaneously, I am trying to PARTITION BY gid (i.e. by the "game id").

Unfortunately, I get a syntax error window function calls cannot be nested with my SQL query:

SELECT AVG(played - LAG(played) OVER (PARTITION BY gid order by played))
OVER (PARTITION BY gid order by played)
FROM moves
-- trying to calculate average thinking time for player Alice
WHERE uid = 1;

UPDATE:

Since the number of games in my database is large and grows day by day, I have tried (here the new SQL Fiddle) adding a condition to the inner select query:

SELECT AVG(played - prev_played)
FROM (SELECT m.*,
      LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
      FROM moves m
      JOIN games g ON (m.uid in (g.player1, g.player2))
      WHERE m.played > now() - interval '1 month'
     ) m
WHERE uid = 1;

However for some reason this changes the returned value quite radically to 1 min 45 sec.

And I wonder, why does the inner SELECT query suddenly return much more rows, is maybe some condition missing in my JOIN?

UPDATE 2:

Oh ok, I get why the average value decreases: through multiple rows with same timestamps (i.e. played - prev_played = 0), but how to fix the JOIN?

UPDATE 3:

Nevermind, I was missing the m.gid = g.gid AND condition in my SQL JOIN, now it works:

SELECT AVG(played - prev_played)
FROM (SELECT m.*,
      LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
      FROM moves m
      JOIN games g ON (m.gid = g.gid AND m.uid in (g.player1, g.player2))
      WHERE m.played > now() - interval '1 month'
     ) m
WHERE uid = 1;

Upvotes: 3

Views: 233

Answers (2)

Juan Carlos Oropeza
Juan Carlos Oropeza

Reputation: 48207

Probably @gordon answer is good enough. But that isn't the result you ask in your comment. Only works because the data have same number of rows for each game so average of games is the same as complete average. But if you want average of the games you need one additional level.

With cte as (
    SELECT gid, AVG(played - prev_played) as play_avg
    FROM (select m.*,
                 lag(m.played) over (partition by gid order by played) as prev_played
          from moves m      
         ) m
    WHERE uid = 1
    GROUP BY gid
)
   SELECT AVG(play_avg)
   FROM cte
;

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1271231

You need subqueries to nest the window functions. I think this does what you want:

select avg(played - prev_played)
from (select m.*,
             lag(m.played) over (partition by gid order by played) as prev_played
      from moves m
     ) m
where uid = 1;

Note: The where needs to go in the outer query, so it doesn't affect the lag().

Upvotes: 2

Related Questions