Reputation: 1527
I'm trying to run this query. It's very slow. After 30sec I still have no results.
I have indexes on step, source, video_id.
step 1 is video uploaded
step 4 is video ready to download
source 5 is a video type
In encoding_history table, I have many times the same video_id with different steps of process (uploading, encoding, ...)
My need is to get all videos uploaded more than 6 hours ago and still not ready to download.
If I remove the subquery, it's pretty fast.
My table have about 6.000.000 entries.
$query = 'SELECT video_id FROM encoding_history WHERE
video_id NOT IN (SELECT video_id FROM encoding_history WHERE step = 4 AND source = 5 GROUP BY video_id)
AND step = 1
AND source = 5
AND date_added > DATE_SUB(NOW(), INTERVAL 6 HOUR)';
Any ideas on how to speed up my results? Thanks.
Upvotes: 0
Views: 120
Reputation: 92
There is no way that a video is in step 1 and step 4 in the same time so just get rid of the sub query
On another thaught (thanks to xQbert) you could move this operation to your application instead of the db.
Upvotes: 0
Reputation: 35333
Execution plan will help the most as you may be missing indexes that would help but this I think would operate faster.
SELECT Eh1.video_id
FROM encoding_history EH1
LEFT JOIN Encoding_history EH2
on EH2.step=4 and EH2.source=5
and EH1.step=1 and EH1.source=5
AND EH1.date_added > DATE_SUB(NOW(), INTERVAL 6 HOUR)
WHERE Eh2.Video_ID is null
Result: Return all Video_IDS that are in step 1 with a source of 5 that do not have a step 4 source 5. That were added within the past 6 hours.
Upvotes: 1