Imrul.H
Imrul.H

Reputation: 5870

MYSQL select 2 random rows from each categories

I have a mysql table with columns like this:

`qid`, `category`, `question`, `choice_1`, `choice_2`, `choice_3`, `answer`

In this table I have 2000 data and there are 7 different 'category'. I want to fetch 15 random rows from this table with equal number from each category. Since I have 7 categories, it's not possible to get equal number of rows from each category. In this case I can get 3 from one category. How can I achieve that with one query?

I was thinking of getting 2 rows from each categories. Then I will have 14 rows in total, and then I can get 1 random row from the table and merge records. Please show me an example query for this. Thanks. Here is my actual table structure:

CREATE TABLE IF NOT EXISTS `difi_questions` (
  `qid` smallint(6) NOT NULL AUTO_INCREMENT,
  `category` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `question` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `choice_1` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `choice_2` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `choice_3` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `answer` enum('1','2','3') COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`qid`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1959 ;

Upvotes: 1

Views: 2583

Answers (2)

Hakim
Hakim

Reputation: 1314

Try this :)

CREATE TABLE IF NOT EXISTS `difi_questions_temp` (
  `qid` smallint(6) NOT NULL ,
  `category` varchar(255)  NOT NULL
);

INSERT INTO difi_questions_temp
select qid, category from (
SELECT qid, category
FROM difi_questions 
order by rand()) as x
GROUP BY category;


INSERT INTO difi_questions_temp
select qid, category from (
SELECT qid, category
FROM difi_questions 
where qid not in (select qid from difi_questions_temp)
order by rand()) as x
GROUP BY category;

insert into difi_questions_temp 
select qid, category from difi_questions
where qid not in (select qid from difi_questions_temp)
order by rand()
limit 1 ;

/*select qid,  category from difi_questions_temp order by category; */
select * from difi_questions where qid in ( select qid from difi_questions_temp);
drop table difi_questions_temp;

Upvotes: 0

dognose
dognose

Reputation: 20899

Just fetch 2 per category as you described, and one random at the end. It is not one query, but one result-set, which might be what you need:

SELECT * FROM (SELECT * FROM questions WHERE category= 1 ORDER BY rand() limit 0,2) as t1
UNION
SELECT * FROM (SELECT * FROM questions WHERE category= 2 ORDER BY rand() limit 0,2) as t2
UNION 
SELECT * FROM (SELECT * FROM questions WHERE category= 3 ORDER BY rand() limit 0,2) as t3
UNION
...

(The nested Select allows you to sort by rand() per category) Nothing special so far - 2 random questions per category.

The tricky part now is to add the 15th, element WITHOUT selecting any of those you already have.

To achieve this with "one" call, You can do the following:

  • Take the subset of 14 questions you have selected like above.
  • Union this with a uncategorized set of random sorted things from the database. (limit 0,15)
  • Select all from this result, limit 0,15.

  • IF the first 14 elements of the LAST subquery are already selected - they will be removed due to UNION, and a independent 15th element is guaranteed.

  • If the final inner query selects 15 distinct questions as well, the outer limit 0,15 will only take the first of them into the result.

Something like:

SELECT * FROM (
    SELECT * FROM (SELECT * FROM questions WHERE category= 1 ORDER BY rand() limit 0,2) as t1
    UNION
    SELECT * FROM (SELECT * FROM questions WHERE category= 2 ORDER BY rand() limit 0,2) as t2
    UNION 
    SELECT * FROM (SELECT * FROM questions WHERE category= 3 ORDER BY rand() limit 0,2) as t3
    UNION
    ...
    UNION
    SELECT * FROM (SELECT * FROM questions ORDER BY rand() LIMIT 0,15) as t8
) AS tx LIMIT 0,15

This is somewhat ugly, but should exactly do what you need: 2 random questions from EACH category, and finally a random question that has NOT been selected already from ANY category. A total of 15 questions at any time.

(Sidenode: You could as well run a second query, using NOT IN () to dissallow already selected questions after determining the 14 questions for the 7 categories.)

Edit: Unfortunately SQL Fiddle is not working at the moment. Here's some fiddle code:

CREATE TABLE questions (id int(10), category int(10), question varchar(20));

INSERT INTO questions (id, category, question)VALUES(1,1,"Q1");
INSERT INTO questions (id, category, question)VALUES(2,1,"Q2");
INSERT INTO questions (id, category, question)VALUES(3,1,"Q3");
INSERT INTO questions (id, category, question)VALUES(4,2,"Q4");
INSERT INTO questions (id, category, question)VALUES(5,2,"Q5");
INSERT INTO questions (id, category, question)VALUES(6,2,"Q6");
INSERT INTO questions (id, category, question)VALUES(7,3,"Q7");
INSERT INTO questions (id, category, question)VALUES(8,3,"Q8");
INSERT INTO questions (id, category, question)VALUES(9,3,"Q9");
INSERT INTO questions (id, category, question)VALUES(10,4,"Q10");
INSERT INTO questions (id, category, question)VALUES(11,4,"Q11");
INSERT INTO questions (id, category, question)VALUES(12,4,"Q12");
INSERT INTO questions (id, category, question)VALUES(13,5,"Q13");
INSERT INTO questions (id, category, question)VALUES(14,5,"Q14");
INSERT INTO questions (id, category, question)VALUES(15,5,"Q15");
INSERT INTO questions (id, category, question)VALUES(16,6,"Q16");
INSERT INTO questions (id, category, question)VALUES(17,6,"Q17");
INSERT INTO questions (id, category, question)VALUES(18,6,"Q18");
INSERT INTO questions (id, category, question)VALUES(19,7,"Q19");
INSERT INTO questions (id, category, question)VALUES(20,7,"Q20");
INSERT INTO questions (id, category, question)VALUES(21,7,"Q21");

Query

SELECT * FROM (
    SELECT * FROM (SELECT * FROM questions WHERE category= 1 ORDER BY rand() limit 0,2) as t1
    UNION 
    SELECT * FROM (SELECT * FROM questions WHERE category= 2 ORDER BY rand() limit 0,2) as t2
    UNION 
    SELECT * FROM (SELECT * FROM questions WHERE category= 3 ORDER BY rand() limit 0,2) as t3
    UNION 
    SELECT * FROM (SELECT * FROM questions WHERE category= 4 ORDER BY rand() limit 0,2) as t4
    UNION 
    SELECT * FROM (SELECT * FROM questions WHERE category= 5 ORDER BY rand() limit 0,2) as t5
    UNION 
    SELECT * FROM (SELECT * FROM questions WHERE category= 6 ORDER BY rand() limit 0,2) as t6
    UNION 
    SELECT * FROM (SELECT * FROM questions WHERE category= 7 ORDER BY rand() limit 0,2) as t7
    UNION 
    SELECT * FROM (SELECT * FROM questions ORDER BY rand() LIMIT 0,15) as t8
) AS tx LIMIT 0,15

the example data contains 3 questions per type, leading to the result that the 15th question (last row) is ALWAYS the one remaining from a category.

Upvotes: 5

Related Questions