Reputation: 4050

How to code dates for efficient query in mysql

I am wondering if anyone can help me answer this technical question. I have a list of things that can or cannot happen at any given date/time during the course of a year. Let's say there are a million items. To simplify let's say this thing can or cannot happen every half of an hour, 365 days a year. Therefore there are 17,520 possible binary values. I need to query against this in real time to say whether or not something can happen at time x. For obvious reasons it is not efficient to create a table with 17,520 columns.

Can anyone recommend any methods that would allow me to handle such data? Thanks.

Upvotes: 2

Answers (5)

O. Jones

Reputation: 108651

The SQLish way to do this is to record rows in a table that look roughly like this, with the name of the event and the timestamp.

   event     | when
   happened  | 2014-01-01 10:13
   happened  | 2014-01-01 10:33
   didnt     | 2014-01-01 11:13

Then, to get a summary of the events by hour for yesterday, you can do this:

 SELECT DATE_FORMAT(when,'%Y-%m-%d %H:00') AS when,
        SUM(IF('happened'=event,1,0)) AS happened,
        SUM(IF('didnt'=event,1,0)) AS didnt
   FROM event
  WHERE when >= CURDATE() - INTERVAL 1 DAY
    AND when <  CURDATE() 
  GROUP BY DATE_FORMAT(when,'%Y-%m-%d %H:00')
  ORDER BY DATE_FORMAT(when,'%Y-%m-%d %H:00')

Some notes here: The DATE_FORMAT(when,'%Y-%m-%d %H:00') expression reduces dates to the nearest hour. The WHERE expression picks up everything from midnight yesterday up to but not including midnight today.

If you want to summarize by half-hour, use this GROUP BY expression instead:

DATE_FORMAT(obstime,'%Y-%m-%d %H:00') + 
            INTERVAL (MINUTE(obstime) - MINUTE(obstime) MOD 30) MINUTE

In the SQL world there's no need to encode this sort of timeseries data in hyperoptimized data structures like bit strings. That's basically true in all modern computers; disk space is really cheap.

Upvotes: 0

Loopo

Reputation: 2195

I'd recommend using something like a bitmap to store this type of data.

This should be easy to manipulate and would store efficiently (possibly compressed).

retrieve the bitmap for the event and 'AND' it with your timeslot. To find out if it is a valid time or not.

Upvotes: 1

Zak

Reputation: 25205

3 tables:

table 1 : things (id, thingname)

table 2 : all possible times thing(s) from table 1 can happen times -(id,time)

table 3 : mapping table of things -> times they can happen thing_times(thing_id,time_id)

if you want to look up can thing x happen at time y

select * from thing_times inner join times on thing_times.timeid = times.id inner join 
things on thing_times.thingid = things.id where times.time = $canItHappenTime and 
things.thingname = $thingIWantToCheckTimeFor

Or you can leave out the join to table1 if you already know the id of the thing

also, make sure you add indexes on the columns (thingname and time) you are searching on so the query doesn't have to do table scans...

Upvotes: 0

digitai

Reputation: 1842

I recommend to create a time lapse dimension table, the 17520 records, which you can use to make reports for example of at what time lapses did nothing happened.

Keep in mind that more inefficient are many column tables rather than a two or three columns table like this but long on records.

Upvotes: 0

J. Born

Reputation: 101

Depending on the amount of times the event happen/doesn't happen, i would recommend saving the difference between dates in a given measurement (half of an hour, i guess?). If there's a lot of times where it doesn't happen, then you can save times it does happen and skip through times where it doesn't.

Ex,

if you have

v = 1 0 0 1 1 0 0 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1

you can save it as

v2 = 0 2 0 4 0 0 4 1 2 4

if you save this as something smaller than an int, this can also be efficient in space.

Upvotes: 0

How to code dates for efficient query in mysql

Answers (5)

Related Questions