Nathanael
Nathanael

Reputation: 7183

MySQL GROUP BY rand()?

So, interesting problem I've run into. I'm sure there's an easy solution, but I'm not sure what it is. :)

Basically, imagine a very simple database, like so:

----------------
T1
----------------
  r  |  nID
---------------
  1  |   A
  2  |   B
----------------




----------------
T2
----------------
 nID |  val
---------------
  A  |   XXX
  B  |   L
  B  |   M
  B  |   N
  B  |   P
----------------

Basically, Table 2 references Table 1. Now, I'd like to select a random row from either A or B. However, I would like to first randomize the A and B, THEN choose an associated value.

In other words, flip a coin: Heads, XXX. Tails, L, M, N, or P.

My current query joins the two tables, orders by RAND(), and then LIMIT 1. However, this makes the odds of a B value being chosen much more likely than an A value being chosen. I'm using PHP, so I could easily just run two queries, but running one query would be much tidier, so I want to see what you guys recommend.

Any solutions? =)

EDIT:

Here's my current query, but it isn't working. Not sure why!

SELECT *
FROM t2
WHERE
    nID =
    (
        SELECT nID
        FROM t1
        ORDER BY RAND()
        LIMIT 1
    )
ORDER BY RAND()
LIMIT 1

EDIT 2:

To demonstrate the issue I'm having, I created a test case. First, I created the following tables:

T1 and T2

I want the odds of selecting XXX to be identical to selecting L, M, N, or P. The query I have should do it, right? So I tested it. The followings script runs the query 5000 times, and counts the results. They should be about 50-50, with XXX showing up approximately 2500 times, and everything else showing up about 2500 times as well.

$a = 0;
$b = 0;
$i = 0;
while ($i < 5000)
{
    $query = mysql_query("
        SELECT *
        FROM t2
        WHERE
            nID =
            (
                SELECT nID
                FROM t1
                ORDER BY RAND()
                LIMIT 1
            )
        ORDER BY RAND()
        LIMIT 1
        ") or die(mysql_error());

    $result = mysql_fetch_array($query);
    if ($result['val'] == 'XXX')
    {
        $a++;
    }
    else
    {
        $b++;
    }

    $i++;
}

echo "XXX - $a<br />";
echo "Other - $b<br />";

Here are the results:

XXX - 937
Other - 4063

Let's run it again.

XXX - 968
Other - 4032

And let's run it one more time.

XXX - 932
Other - 4068

This is hardly the 50-50 split we'd expect to see given my query. What on earth's going on? Thanks for your help, guys!

Upvotes: 2

Views: 2179

Answers (4)

Joachim Isaksson
Joachim Isaksson

Reputation: 181077

Your example inner query is evaluated multiple times, if you want it to choose A or B once, you need to rewrite it, for example as a JOIN;

SELECT q2.nID, q2.val
FROM ( SELECT nID FROM T1 ORDER BY RAND() LIMIT 1 ) q1
JOIN T2 q2 ON q1.nID = q2.nID
ORDER BY RAND()
LIMIT 1

If you're working with small tables, this query should be ok, but read here for example on why you shouldn't use ORDER BY RAND() for large tables.

Demo here.

Upvotes: 2

Andrew
Andrew

Reputation: 4624

You would expect that the subquery in your question would be run once per outer query, but it looks like that's not the case. I think the below might give you what you're after:

SET @randID = (SELECT nID
FROM T1
ORDER BY RAND()
LIMIT 1);

SELECT VAL
FROM T2
WHERE nID = @randID
ORDER BY RAND()
LIMIT 1;

(SQL Fiddle)

Upvotes: 2

Hammerite
Hammerite

Reputation: 22350

SELECT
    CASE rq.r WHEN '1' THEN t1q.r ELSE t2q.nID END AS Col1,
    CASE rq.r WHEN '1' THEN t1q.nID ELSE t2q.val END AS Col2
FROM
    (SELECT CASE WHEN RAND() < 0.5 THEN '1' ELSE '2' END AS r) AS rq
    JOIN (SELECT * FROM T1 ORDER BY RAND() LIMIT 1) as t1q
    JOIN (SELECT * FROM T2 ORDER BY RAND() LIMIT 1) as t2q

Observation: This query is inefficient because it requires selecting a random row from both tables even though only one is used. Perhaps a better way exists.

Upvotes: 0

Er. Anurag Jain
Er. Anurag Jain

Reputation: 1793

Please try query given below

SELECT `table2`.*  FROM `table2` WHERE table2.field1 = (Select table1.field2 from table1 order by RAND() limit 0,1) LIMIT 0,1

Here i assume column name field1 and field2 for both table so please use field name according your table structure.

thanks

Upvotes: 1

Related Questions