Reputation: 23
I have a table in SQL (using postgres) with two columns: videoid , classification
videoid | classification
1 |20
1 |24
1 |24
1 |24
1 |24
2 |20
2 |20
2 |20
2 |20
2 |24
2 |24
3 |24
3 |24
I am trying to retrieve all videoid's where the most common classification is 24. (The answer should only be videoid 1 and 3) When i use the query: (found from How to select most frequent value in a column per each id group?)
SELECT DISTINCT ON (videoid) videoid, most_frequent_species FROM (
SELECT videoid, classification AS most_frequent_species, count(*) as _count
FROM userviewed
GROUP BY videoid, classification) a
ORDER BY videoid, _count DESC
;
I retrieve the result set:
videoid | most_frequent_species
1 |24
2 |20
3 |24
But when i try to add a WHERE CLAUSE:
WHERE classification = 24
i get:
videoid| most_frequent_species
1 |24
2 |24
3 |24
How do I create a query that would only retrieve
videoid | most_frequent_species
1 |24
3 |24
Upvotes: 2
Views: 119
Reputation: 1269563
One method uses distinct on
and then filtering:
select *
from (select distinct on (videoid) videoid, category, count(*)
from userviewed
group by videoid, category
order by videoid, count(*) desc
) vc
where category = 24;
Upvotes: 1
Reputation: 3102
You can make use of the having
clause (essentially a post-group where clause), and the mode
function:
select
videoid
from
userviewed
group by
videoid
having
mode() within group (order by classification) = 24
Upvotes: 2