Joost Baaij

Reputation: 7598

The most efficient way to schedule content on a website

I have lots of content that needs to be published on a web site some time in the future. What's the most efficient way to publish it when the time comes?

My current implementation uses two datetime columns: online_at and offline_at.

The SQL query to fetch content looks something like this:

SELECT * 
  FROM contents 
 WHERE online_at <= current_timestamp 
   AND (offline_at IS NULL OR 
        offline_at > current_timestamp);

with an index over the online_at and offline_at columns. It works well and there are no obvious performance penalties, but I'm still wondering if there is a more efficient way to go about this. Is there a way to reduce the index to one simpler column (not a datetime, which seems expensive)?

Upvotes: 0

Views: 90

Answers (2)

user806549

Reputation:

I've seen this construction many times, and I've only seen it become prohibitively slow once the table reaches millions of rows, so I'm not sure you really need to worry about it.

One thing that I haven't tried myself, but that could give you increased parallelism, is to have separate indexes on online_at and offline_at and then use EXCEPT/MINUS (depending on your DB). In essence you only need the IDs, but this could obviously be extended to the entire field list except for the dates, i.e.:

SELECT id, header, text, ... 
  FROM contents
 WHERE online_at < current_timestamp
 MINUS
SELECT id, header, text, ... 
  FROM contents
 WHERE offline_at < current_timestamp
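To make the EXCEPT/MINUS idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under assumptions: SQLite spells the operator EXCEPT, timestamps are stored as ISO-8601 text, the index names and sample rows are made up, and the current time is passed in as a parameter so the result is reproducible. It says nothing about whether this form is actually faster on your database.

```python
import sqlite3

def visible_ids(conn, now):
    """IDs that have gone online minus IDs that have already gone offline."""
    rows = conn.execute(
        """
        SELECT id FROM contents WHERE online_at < :now
        EXCEPT
        SELECT id FROM contents WHERE offline_at < :now
        ORDER BY id
        """,
        {"now": now},
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contents (
        id         INTEGER PRIMARY KEY,
        header     TEXT,
        online_at  TEXT NOT NULL,
        offline_at TEXT              -- NULL means "no end date"
    );
    -- Separate single-column indexes, as suggested in the answer
    -- (index names are hypothetical).
    CREATE INDEX idx_contents_online  ON contents (online_at);
    CREATE INDEX idx_contents_offline ON contents (offline_at);
""")
conn.executemany(
    "INSERT INTO contents VALUES (?, ?, ?, ?)",
    [
        (1, "live, no end",     "2024-01-01 00:00:00", None),
        (2, "live, ends later", "2024-01-01 00:00:00", "2024-12-31 00:00:00"),
        (3, "expired",          "2024-01-01 00:00:00", "2024-03-01 00:00:00"),
        (4, "scheduled",        "2024-09-01 00:00:00", None),
    ],
)

result = visible_ids(conn, "2024-06-01 00:00:00")
print(result)
```

Rows whose offline_at is NULL never appear in the second SELECT, because a comparison with NULL is not true, so they stay in the result without an explicit IS NULL check.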

Upvotes: 1

gview

Reputation: 15361

You did not state which database you are using, but if it is MySQL, declaring your columns as TIMESTAMP instead of DATETIME will use only 4 bytes per value vs 8.

Also, I would suggest ensuring that offline_at does not default to NULL or allow NULL. Use a perpetuity date (some date far in the future) as the default. You can alter the table definition and set this as the default date. For a timestamp that date is '2037-12-31'. For a datetime, it is '9999-12-31'.
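As a rough sketch of what the perpetuity default looks like in a table definition, here is a version using Python's sqlite3 module. The table layout, the FOREVER constant and the sample rows are assumptions for illustration; in MySQL you would put the same NOT NULL DEFAULT clause on a DATETIME or TIMESTAMP column instead of a text column.

```python
import sqlite3

# Assumed sentinel meaning "never goes offline"; the answer suggests
# '9999-12-31' for DATETIME and '2037-12-31' for TIMESTAMP columns.
FOREVER = "9999-12-31 00:00:00"

conn = sqlite3.connect(":memory:")
conn.execute(f"""
    CREATE TABLE contents (
        id         INTEGER PRIMARY KEY,
        online_at  TEXT NOT NULL,
        offline_at TEXT NOT NULL DEFAULT '{FOREVER}'
    )
""")

# Row 1 omits offline_at, so the perpetuity default is stored.
conn.execute(
    "INSERT INTO contents (id, online_at) VALUES (1, '2024-01-01 00:00:00')"
)
# Row 2 has an explicit end date.
conn.execute(
    "INSERT INTO contents (id, online_at, offline_at) "
    "VALUES (2, '2024-01-01 00:00:00', '2024-03-01 00:00:00')"
)
stored = conn.execute(
    "SELECT id, offline_at FROM contents ORDER BY id"
).fetchall()

# An explicit NULL is rejected by the NOT NULL constraint.
try:
    conn.execute(
        "INSERT INTO contents (id, online_at, offline_at) "
        "VALUES (3, '2024-01-01 00:00:00', NULL)"
    )
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

print(stored, null_rejected)
```

On an existing table you would first have to replace the current NULL values with the perpetuity date before the NOT NULL constraint can be added.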

Now unless this is going to be a large database at some point, which is probably unlikely, saving 8 bytes per row isn't that big of a deal, but you asked ;)

If this is not a MySQL question, please let me know and I'll remove this answer. If it is, you might want to retag.

You can also rewrite the query to be SELECT ... WHERE NOW() BETWEEN online_at AND offline_at

But again, you want to eliminate the IS NULL check, because the OR it requires makes it harder for the optimizer to use the index.
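Here is a small sketch of the BETWEEN form, again with Python's sqlite3 module. It assumes offline_at is NOT NULL with a far-future default as suggested above, that timestamps are stored as ISO-8601 text, and that the current time is passed in as a parameter instead of NOW() so the example is reproducible. The index name and sample rows are made up.

```python
import sqlite3

def live_ids(conn, now):
    """IDs whose [online_at, offline_at] window contains `now`."""
    rows = conn.execute(
        "SELECT id FROM contents "
        "WHERE ? BETWEEN online_at AND offline_at "
        "ORDER BY id",
        (now,),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contents (
        id         INTEGER PRIMARY KEY,
        online_at  TEXT NOT NULL,
        offline_at TEXT NOT NULL DEFAULT '9999-12-31 00:00:00'
    );
    -- One composite index over both columns (hypothetical name).
    CREATE INDEX idx_contents_window ON contents (online_at, offline_at);
""")
conn.executemany(
    "INSERT INTO contents (id, online_at, offline_at) VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 00:00:00", "9999-12-31 00:00:00"),  # live, no end
        (2, "2024-01-01 00:00:00", "2024-03-01 00:00:00"),  # expired
        (3, "2024-09-01 00:00:00", "9999-12-31 00:00:00"),  # scheduled
    ],
)

live = live_ids(conn, "2024-06-01 00:00:00")
print(live)
```

Note that BETWEEN is inclusive at both ends, so a row counts as live at the exact moment of its online_at and its offline_at.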

Upvotes: 0
