Reputation: 35
I've a MySQL question
I've two tables (posts and authors) in a one to many relationship (since each post is written by an author and an author can write multiple posts).
So here are the tables:
Authors: id:BIGINT, name:VARCHAR(255) Posts: id:BIGINT, author_id:BIGINT, body:TEXT
I've got 700,000 posts and 60,000 authors.
If I choose an author (e.g. author_id = 45) and I want a random post written by him I write:
SELECT * FROM Posts WHERE author_id = 45 ORDER BY RAND() LIMIT 1;
I know this is right, but when I got 4,000 simultaneous people online it takes about 6 secs..
Maybe indexing author_id column in Posts table would speed up things?
Thank you all! :)
Upvotes: 2
Views: 150
Reputation: 1588
if author_id is foreign key then it does not need to create index. It has built-in index.
Upvotes: 0
Reputation: 75125
A [possibly clustered] index on Author_id will definitively help.
There appears to be an additional risk factor with the ORDER BY RAND() part. Essentially this clause causes SQL to dynamically assign a random number to each row (for a given Author_id), and to order these. This could become a bottleneck as some prolific authors start having hundred and thousands of posts.
Upvotes: 0
Reputation: 2377
Indexing should mirror your most popular WHERE clause scenarios.
In this particular case, create your index, then change your query to this:
SELECT id,author_id,body
FROM Posts
WHERE author_id = 45
ORDER BY RAND()
LIMIT 1;
This will prevent a schema lookup prior to the search thereby increasing performance.
SELECT * is evil for high frequency queries.
Upvotes: 5
Reputation: 13075
Especially in a situation where you read your data a lot more than you update your data, be generous when setting up indexing. Anything you'll ever have in a where clause should be indexed.
Upvotes: 0
Reputation: 6485
If you haven't and index on author_id, definitely put one on it. Also I'm not sure ORDER BY RAND() is not responsible for the performance drawback. Try adding the index and it should already improve dramatically.
Upvotes: 0
Reputation: 70983
Yes, you definitely should add the index.
CREATE INDEX Post_author_id ON Posts(author_id);
As further evidence, run
EXPLAIN SELECT * FROM Posts WHERE author_id = 45 ORDER BY RAND() LIMIT 1;
Upvotes: 2