julio
julio

Reputation: 6728

mysql table design

I am using a table to track visits/views of objects in my site. Each object can have multiple categories and multiple owners. I'd like to be able to display the statistical info as efficiently as possible, without doing big multi-table joins.

Right now I'm generating a guid for each visit-- then attaching that to a row in the hits table that also tracks the object ID, the user's ID, the category ID and the owner ID. Since there can be multiple categories and owners, there's a new row for each of those. Then when I query the table for hits, I do a select distinct for the guids to get the total number of hits per object-- since there may be multiple rows with the same guid, I don't want to count these twice.

Other tactics would be to have several extra columns for "owner2", "owner3", "category2", "category3" etc. or to simply do a join on the object table and get the owner ID(s) and category ID(s) when I generate a report. The downside to this is that it would probably be a pretty ugly query and could be slow (since the object tables could have hundreds of thousands of rows).

Any opinions on the best way to handle this?

Upvotes: 1

Views: 131

Answers (2)

user213154
user213154

Reputation:

7.8.1. Make Your Data as Small as Possible

It's not always the answer but I think it is in your case. Given that you want a report of the number of views of each object by category and of each object by owner then I think your current, relatively normal (normal as in normalized) design is good. I don't like the sound of the “other tactics”.

But there's an optimization that you could consider. Time moves only in one direction. The statistics of the past cannot change. And it seems your hits table will grow monotonically if you don't do something to control it. Can you say that a guid not seen for more than N days is old and can be removed? In that case you could keep a (guid, last_seen_timestamp) table and use it periodically to flush old guids out of the hits table while producing summary statistics for the flushed guids.

Upvotes: 1

Denis de Bernardy
Denis de Bernardy

Reputation: 78413

Other tactics would be to have several extra columns for "owner2", "owner3", "category2", "category3" etc. or to simply do a join on the object table and get the owner ID(s) and category ID(s) when I generate a report.

Your design seems fine at a glance, but if you can, you might want to investigate using PostgreSQL: it has an array type and gist indexing, which would allow you to aggregate the owners/categories as array fields and query against the latter using the overlap (&&) operator.

Upvotes: 0

Related Questions