Reputation: 53931
I have a web page where users upload&watch videos. Last week I asked what is the best way to track video views so that I could display the most viewed videos this week (videos from all dates).
Now I need some help optimizing a query with which I get the videos from the database. The relevant tables are this:
video (~239371 rows)
VID(int), UID(int), title(varchar), status(enum), type(varchar), is_duplicate(enum), is_adult(enum), channel_id(tinyint)
signup (~115440 rows)
UID(int), username(varchar)
videos_views (~359202 rows after 6 days of collecting data, so this table will grow rapidly)
videos_id(int), views_date(date), num_of_views(int)
The table video
holds the videos, signup
hodls users and videos_views
holds data about video views (each video can have one row per day in that table).
I have this query that does the trick, but takes ~10s to execute, and I imagine this will only get worse over time as the videos_views
table grows in size.
SELECT
v.VID,
v.title,
v.vkey,
v.duration,
v.addtime,
v.UID,
v.viewnumber,
v.com_num,
v.rate,
v.THB,
s.username,
SUM(vvt.num_of_views) AS tmp_num
FROM
video v
LEFT JOIN videos_views vvt ON v.VID = vvt.videos_id
LEFT JOIN signup s on v.UID = s.UID
WHERE
v.status = 'Converted'
AND v.type = 'public'
AND v.is_duplicate = '0'
AND v.is_adult = '0'
AND v.channel_id <> 10
AND vvt.views_date >= '2001-05-11'
GROUP BY
vvt.videos_id
ORDER BY
tmp_num DESC
LIMIT
8
All the relevant fields are indexed.
And here is a screenshot of the EXPLAIN result:
So, how can I optimize this?
UPDATE
This is my query based on the answer by Quassnoi. It returns the correct videos, but it messes up the JOIN on the signup table. For some records the username
field is NULL, for others it contains the wrong username.
SELECT
v.VID,
v.title,
v.vkey,
v.duration,
v.addtime,
v.UID,
v.viewnumber,
v.com_num,
v.rate,
v.THB,
s.username
FROM
(SELECT
videos_id,
SUM(num_of_views) AS tmp_num
FROM
videos_views
WHERE
views_date >= '2010-05-13'
GROUP BY
videos_id
) q
JOIN video v ON v.VID = q.videos_id
LEFT JOIN signup s ON s.UID = v.VID
WHERE
v.type = 'public'
AND v.channel_id <> 10
AND v.is_adult = '0'
AND is_duplicate = '0'
ORDER BY
tmp_num DESC
LIMIT
8
Upvotes: 2
Views: 217
Reputation: 425371
Create the following index:
video_views (views_date, videos_id)
, and get rid of the LEFT JOIN
between videos
and views
(it does not work with your current query, anyway):
SELECT *
FROM (
SELECT videos_id, SUM(num_of_views) AS tmp_num
FROM video_views
GROUP BY
videos_id
) q
JOIN videos v
ON v.vid = q.videos_id
LEFT JOIN
signup s
ON s.UID = v.UID
ORDER BY
tmp_num DESC
LIMIT 8
If you want to return zero for videos that had never been viewed, change the order of fields in the index:
video_views (videos_id, views_date)
and rewrite the query:
SELECT *,
(
SELECT COALESCE(SUM(num_of_views), 0)
FROM video_views vw
WHERE vw.videos_id = v.vid
AND views_date >= '2001-05-11'
) AS tmp_num
FROM videos v
LEFT JOIN
signup s
ON s.UID = v.UID
ORDER BY
tmp_num DESC
LIMIT 8
Upvotes: 1
Reputation: 536379
Yeah, ORDER BY
on a computed column is always going to be unindexable. Sorry.
If you're going to be doing this query a lot and you want to avoid the views for each video having to be counted and ordered each time, you'll have to denormalise. Add a views_in_last_week
column, recalculate it from videos_views
in the background each day, and index it (possibly in a compound index with other relevant WHERE conditions).
Upvotes: 2