john_stam
john_stam

Reputation: 61

Subquery in from clause using HQL

I have a table articles with 500k rows. An article has a list of authors. I am trying to create a query to get the latest published article for a list of authors.

I used the following HQL query which gets me what i want but runs pretty slow (~4s)

            select author, article
            from Article article inner join article.authors author
            where (author.id, article.publishedAt) in 
            (select author.id, max(article.publishedAt) 
            from Article article join article.authors author
            where author.id in (authors_list))
            group by author.id

A possible better query in plain sql would be:

              select * from (
                select articles.id, author.id
                from articles, article_authors, authors
                where articles.id = article_authors.article_id and 
                    article_authors.author_id=authors.id    
                    and author.id in (author_list)  
                    order by articles.publishedAt desc
              ) b
              group by authors.id;

but from the Hibernate docs it is stated that HQL subqueries can occur only in the select or where clauses. http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/queryhql.html#queryhql-subqueries

Is there a way to emulate this kind of query using HQL or another way to enhance the performance of the query?

Upvotes: 6

Views: 8950

Answers (1)

jswan
jswan

Reputation: 90

In either scenario, you want to try to segregate the data you are using to compare if it is large. in the first query above, where you have:

in 
            (select author.id, max(article.publishedAt) 
            from Article article join article.authors author
            where author.id in (authors_list))

Try putting that statement into a temp table first, and then using that small set of data for efficiency. So it would look like:

select author.id, max(article.publishedAt) into #temp1
                from Article article join article.authors author
                where author.id in (authors_list))

 select author, article
            from Article article inner join article.authors author
            where (author.id, article.publishedAt) in 
            (select author.id, article.publishedAt 
            from #temp1)
            group by author.id

Because the calculations are done and then dataset is smaller, it should improve performance.

Upvotes: 0

Related Questions