cornsnack
cornsnack

Reputation: 23

SQL: only selecting id's where the most frequent value of a second column is a CERTAIN VALUE (Grouped by id)

I have a table in SQL (using postgres) with two columns: videoid , classification

videoid | classification
1       |20
1       |24
1       |24
1       |24
1       |24
2       |20
2       |20
2       |20
2       |20
2       |24
2       |24
3       |24
3       |24

I am trying to retrieve all videoid's where the most common classification is 24. (The answer should only be videoid 1 and 3) When i use the query: (found from How to select most frequent value in a column per each id group?)

    SELECT DISTINCT ON (videoid) videoid, most_frequent_species FROM (
  SELECT videoid, classification AS most_frequent_species, count(*) as _count
FROM userviewed
GROUP BY videoid, classification) a
ORDER BY videoid, _count DESC
;

I retrieve the result set:

videoid     | most_frequent_species
    1       |24
    2       |20
    3       |24

But when i try to add a WHERE CLAUSE:

WHERE classification = 24

i get:

videoid| most_frequent_species
1      |24
2      |24
3      |24

How do I create a query that would only retrieve

videoid | most_frequent_species
1       |24
3       |24

Upvotes: 2

Views: 119

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269563

One method uses distinct on and then filtering:

select *
from (select distinct on (videoid) videoid, category, count(*)
      from userviewed
      group by videoid, category
      order by videoid, count(*) desc
     ) vc
where category = 24;

Upvotes: 1

Scoots
Scoots

Reputation: 3102

You can make use of the having clause (essentially a post-group where clause), and the mode function:

select
    videoid
from
    userviewed
group by
    videoid
having
    mode() within group (order by classification) = 24

Upvotes: 2

Related Questions