Reputation: 61
I have a table articles
with 500k rows. An article has a list of authors. I am trying to create a query to get the latest published article for a list of authors.
I used the following HQL query which gets me what i want but runs pretty slow (~4s)
select author, article
from Article article inner join article.authors author
where (author.id, article.publishedAt) in
(select author.id, max(article.publishedAt)
from Article article join article.authors author
where author.id in (authors_list))
group by author.id
A possible better query in plain sql would be:
select * from (
select articles.id, author.id
from articles, article_authors, authors
where articles.id = article_authors.article_id and
article_authors.author_id=authors.id
and author.id in (author_list)
order by articles.publishedAt desc
) b
group by authors.id;
but from the Hibernate docs it is stated that HQL subqueries can occur only in the select or where clauses. http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/queryhql.html#queryhql-subqueries
Is there a way to emulate this kind of query using HQL or another way to enhance the performance of the query?
Upvotes: 6
Views: 8950
Reputation: 90
In either scenario, you want to try to segregate the data you are using to compare if it is large. in the first query above, where you have:
in
(select author.id, max(article.publishedAt)
from Article article join article.authors author
where author.id in (authors_list))
Try putting that statement into a temp table first, and then using that small set of data for efficiency. So it would look like:
select author.id, max(article.publishedAt) into #temp1
from Article article join article.authors author
where author.id in (authors_list))
select author, article
from Article article inner join article.authors author
where (author.id, article.publishedAt) in
(select author.id, article.publishedAt
from #temp1)
group by author.id
Because the calculations are done and then dataset is smaller, it should improve performance.
Upvotes: 0