Reputation: 163
I have a MySQL database with a lot of tables, and I have two tables that are linked to each other in a many to many relationship, using a third table. Here's a simplified demonstration of how it looks like:
Table 1: books
Table 2: authors
Table 3: books_authors
What I am trying to do is to get a random list of authors with the latest book per each author, so I am getting a list with all author ids, using some scripting to create that random list, and then I am using the following query:
SELECT
`books`.`ID`,
`books`.`Name`
FROM `books`
LEFT JOIN `books_authors` ON `books`.`ID` = `books_authors`.`book_id`
LEFT JOIN `authors` ON `books_authors`.`author_id` = `authors`.`ID`
WHERE `authors`.`ID` IN(8, 12)
LIMIT 1
The problem is, the Limit 1 means that I will only get one book, while I want to get two books, one per each author. How can I do that without running a query for each author? (the database is huge, and a query for each author will bring the server to a crawl). If I increase the limit, then I am not necessarily getting two books one per each other, but I may get two books by the same author.
In other words, I want the limit to be on the IN, not on the entire query. Is that possible? and if not, is there an efficient way of doing what I am trying to do?
Thanks!
David
Upvotes: 4
Views: 4508
Reputation: 101
This is not exactly the answer for this question but it worked for me for similar problem.
I have a resultset from a query with columns A, B, C, D, E. As I want one line for any D+E-type result (I don't care wich one), I've done that with a "group by D,E" function.
35.000+ registers, each column was varchar(30), Mysql take 766 ms to do that work.
Upvotes: 1
Reputation: 1090
You can first select the for each author books with the latest id and join it with books table to get names. Something like that:
SELECT
`books`.`ID`,
`books`.`Name`
FROM `books`
INNER JOIN (
select max(book_id), author_id
from `books_authors`
group by author_id) ba ON `books`.`ID` = ba.`book_id`
WHERE ba.author_id IN (8, 12)
Upvotes: 2
Reputation: 4698
There are a bunch of solutions to this problem here, but the gist of it is that if your dataset is huge, you are probably best off running multiple queries, one per author.
Upvotes: 2