Reputation: 41
I am playing around with a Netflix dataset from Kaggle on MySQL. The dataset contains fields such as Country, Subscription_Type, Age, Gender, Monthly_Revenue, etc. In the Subscription_Type column, the possible return fields are Basic, Standard, Premium. I am trying to find the most common subscription type for each country, so I'd ideally have one result per country.
SELECT *
FROM (
SELECT Country,
Subscription_Type,
COUNT(*) as count_subscription,
DENSE_RANK() OVER (PARTITION BY Country ORDER BY (SELECT COUNT(*) as count_subscription FROM netflix) ) as `rnk`
FROM netflix
GROUP BY Country, Subscription_Type
) as subscription_rank
WHERE rnk = 1;
The problem I'm having is that Dense_Rank is ranking each row as 1, so I end up getting three results per country. How can I fix it so that I'm able to get the most popular subscription type per country?
Upvotes: 2
Views: 163
Reputation: 72194
The subquery is nonsense: it's not correlated so returns the same result on every row.
Moreover, it's not necessary. Window functions work after aggregation, so you can just use COUNT(*)
directly.
SELECT *
FROM (
SELECT nf.Country,
nf.Subscription_Type,
COUNT(*) as count_subscription,
DENSE_RANK() OVER (PARTITION BY nf.Country ORDER BY COUNT(*)) as rnk
FROM netflix nf
GROUP BY
nf.Country,
nf.Subscription_Type
) as subscription_rank
WHERE rnk = 1;
I might be wrong, but maybe you wanted COUNT(*) DESC
? Also DENSE_RANK
will return tied results, whereas ROW_NUMBER
does not.
Upvotes: 0