FreeBird
FreeBird

Reputation: 721

Counting records with a time instant within a start and end time

Suppose I have a table like this:

CREATE TABLE foo (
    gid BIGINT PRIMARY KEY,
    starttime BIGINT,
    endtime BIGINT
);

This table stores the start and end times of a series of games (in "seconds from epoch" format). Now, I want to know how many games were running at a give instant. This is the natural query:

SET @t = UNIX_TIMESTAMP('2012-07-12 12:00:00');
SELECT COUNT(f.gid) FROM foo f WHERE @t BETWEEN f.starttime AND f.endtime;

The complication is that I need to do this every five minutes (each game lasts only a couple of minutes, and we there are several thousand every hour) and likely for a six-month duration. I do have the procedure to loop over the date range I am interested in and generate the @t for five-minute intervals. The problem is that the query is far too slow. I am currently storing all @t in a separate table which I have indexed, like this:

CREATE TABLE bar (
    interval BIGINT PRIMARY KEY
);

So the query I now have is:

SELECT b.interval, COUNT(f.gid)
FROM bar b LEFT JOIN foo f
    ON b.interval BETWEEN f.starttime AND f.endtime
GROUP BY b.interval;

This is too slow, and no amount on indexing on the table "foo" seems to help. This I feel is a standard problem with perhaps a standard query pattern, so would appreciate any help here.

Upvotes: 4

Views: 1176

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269663

Alas, this is much easier with windows/analytic functions.

In mysql, you can use the following strategy to solve the problem.

Create a temporary table of start times. In the start temporary table, have an auto-incrementing id that keeps track of the row number. Then insert the start times in order.

Create a temoprary table of end times. In the end temporary table, have an auto-incrementing id that keeps track of the row number. Then insert the end times in order.

These tables have the accumulated number of starts and stops since the beginning of time.

Assuming that each 5-minute interval has at least one start and stop, we can query this table using:

select t.time, s.seqnum - e.seqnum
from (select <cast datetime to 5-minute interval> as time, seqnum
      from starts
     ) s full outer join
     (select <cast datetime to 5-minute interval> as time, seqnum
      from ends
     ) e
     on s.time = e.time

This is taking the cumulative number of starts and subtracting the cumulative number of stops, for each interval. I'm guessing that you know better than I do how to truncate/round a time to the nearest 5 minutes in mysql.

Upvotes: 0

paul
paul

Reputation: 22001

It strikes me that once an interval has passed, you shouldn't really need to calculate the number of running games again.

so why not add another column to bar?

CREATE TABLE bar (
    interval BIGINT PRIMARY KEY,
    runningGames INT)

that way, you only have to set a scheduled task to run every five minutes

INSERT into bar 
SELECT  UNIX_TIMESTAMP(NOW()),
        COUNT(*) 
FROM    foo
WHERE   endtime is null

and then you shouldn't have to recreate six months worth of data again and again

Upvotes: 1

Related Questions