DugoutSoccer
DugoutSoccer

Reputation: 422

SQL Distinct - Get all values

Thanks for looking, I'm trying to get 20 entries from the database randomly and unique, so the same one doesn't appear twice. But I also have a questionGroup field, which should also not appear twice. I want to make that field distinct, but then get the ID of the field selected.

Below is my NOT WORKING script, because it does the ID as distinct too which

SELECT DISTINCT `questionGroup`,`id` 
FROM `questions` 
WHERE `area`='1' 
ORDER BY rand() LIMIT 20

Any advise is greatly appreciated!

Thanks

Upvotes: 0

Views: 111

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

Try doing the group by/distinct first in a subquery:

select *
from (select distinct `questionGroup`,`id`
      from `questions`
      where `area`='1'
     ) qc
order by rand() 
limit 20

I see . . . What you want is to select a random row from each group, and then limit it to 20 groups. This is a harder problem. I'm not sure if you can do this accurately with a single query in mysql, not using variables or outside tables.

Here is an approximation:

select *
from (select `questionGroup`
             coalesce(max(case when rand()*num < 1 then id end), min(id)) as id
      from `questions` q join
            (select questionGroup, count(*) as num
             from questions
             group by questionGroup
            ) qg
            on qg.questionGroup = q.questionGroup
      where `area`='1'
      group by questionGroup
     ) qc
order by rand() 
limit 20

This uses rand() to select an id, taking, on average two per grouping (but it is random, so sometimes 0, 1, 2, etc.). It chooses the max() of these. If none appear, then it takes the minimum.

This will be slightly biased away from the maximum id (or minimum, if you switch the min's and max's in the equation). For most applications, I'm not sure that this bias would make a big difference. In other databases that support ranking functions, you can solve the problem directly.

Upvotes: 2

T I
T I

Reputation: 9943

Something like this

SELECT DISTINCT *
FROM (
    SELECT `questionGroup`,`id` 
    FROM `questions`
    WHERE `area`='1' 
    ORDER BY rand()
) As q
LIMIT 20

Upvotes: 0

Related Questions