Reputation: 13538
This question comes as a result of a limitation in Amazon Redshift, the columnar analytics database based on Paraccel. One of the unsupported features is references in a GROUP BY column to the results of a correlated subquery.
For example, the following statement will generate a Redshift error because GROUP BY uses list
which is generated by a subquery.
select listing.listid,
(select count (sales.listid) from sales where sales.listid=listing.listid) as list
from listing
group by list, listing.listid;
The following example from Gordon Linoff is another unsupported use case (an answer to a specific question that generated this general question).
select type, (case when cnt > XXX then url end) as url, sum(cnt) as visit_cnt
from (select type, url, count(*) as cnt
from t
group by type, url
) t
group by type, url
order by type, sum(cnt) desc;
The purpose of this question is to find a generic pattern for overcoming this particular Amazon Redshift correlated subquery limitation. What are the alternative SQL patterns for achieving the same outcome as using values from correlated subqueries?
Upvotes: 1
Views: 724
Reputation: 14035
A left join should do the trick, unless I'm missing something.
SELECT listing.listid
,COUNT(sales.listid)
FROM listing
LEFT JOIN sales
ON sales.listid = listing.listid
GROUP BY listing.listid
ORDER BY COUNT(sales.listid) DESC
;
Upvotes: 1