Alex Reynolds
Alex Reynolds

Reputation: 96967

Generating constrained DISTINCT elements with SQL RAND()

I have a SQL (MySQL SQL) statement, something like the following, to grab 10 records from table bar:

SET @foo:=1;
SELECT DISTINCT
    @start:=ROUND(RAND()*100),
    @start+ROUND(RAND()*100)+1,
    CONCAT("id-",@foo:=@foo+1)
FROM
    bar
LIMIT 10

It generates unique elements like:

...
14   72   id-2
...
9    10   id-8
...

So far, so good. But what I would really like to do is use the DISTINCT keyword in a way such that duplicates do not get added, based on the uniqueness of the first or first-and-second column values only.

For example, at present, all 10 records are unique because the id-n element is always different for each record. So this is possible:

...
9    10   id-4
...
9    10   id-8
...

What I would like to do is keep the second 9 10 id-8 element from being in the result set (or, alternatively, 9 10 id-4 — I don't care which near-duplicates are left out). Is there a way to constrain the DISTINCT to the first or first-and-second columns?

I know I could leave out the id-n value from the SQL statement, and then post-process the results with downstream scripts that add that back in. However, out of curiosity, I'd like to find a solution contained within a SQL query. Thanks!

Upvotes: 2

Views: 210

Answers (1)

sgeddes
sgeddes

Reputation: 62831

Well, I'm not 100% sure this is what you're trying to do, but it seems you could put the results of your query in a subquery, then use MAX to get the last record.

Maybe something like this:

SELECT rand1col, rand2col, CONCAT("id-",Max(fooidcol)) as col3
FROM 
(
    SELECT DISTINCT
        @start:=ROUND(RAND()*100) as rand1col,
        @start+ROUND(RAND()*100)+1 as rand2col, 
        @foo:=@foo+1  as fooidcol
    FROM bar
) t
GROUP BY rand1col, rand2col
LIMIT 10

The problem with testing this is RAND() won't return me what I need for testing, so I created this SQL Fiddle with actual columns that try to do just that, but it appears to work with the above query as well.

And here is the Fiddle with the above query -- works some of the time, just keep refreshing.

Upvotes: 1

Related Questions