Reputation: 20356
I have the following destination
, image
and destination_image
tables:
(MYSQL):
I need to select random N images of a given size, per destination; where N is defined at runtime and destinations are selected from a range of id_destination
.
I don't need top
N images, just random N images.
Can someone point me in the right direction of what would be the correct SQL query for this ?
Upvotes: 1
Views: 995
Reputation: 48139
If you want "x" entries per destination, you can do per MySQL variables... something like
select
di.id_destination,
di.id_image,
i.filepath,
i.size,
@RandSeq := if( @lastDestination = di.id_destination, @RandSeq +1, 1 ) as FinalSeq,
@lastDestination := di.id_destination as carryForward
from
destination_image di
join image i
on di.id_image = i.id_image,
( select @lastDestination := 0,
@RandSeq := 0 ) SQLVars
where
di.id_destination between rangeStart and rangeEnd (or similar criteria for your "range" )
having
FinalSeq = 4 ( just a sample, but your "N" entries per destination desired)
order by
id_destination,
rand()
This will get 4 (or "N" in your case) entries PER Destination. The critical thing is the order by. The order by will return the records in this order FIRST, then apply the @sql variables. So, on a per destination, the rows will be RAND()omized first, but random WITHIN a single destination... and the same on the next destination, and the next... Then the @vars generate 1, 2, 3, etc for the column "FinalSeq". The "HAVING" clause restricts which rows to allow returning entries with that criteria in the final row, thus only 4 per this example.
CLARIFICATION OF SQL.
The @variables are almost like inline variables used in a SQL statement... by doing ( select @someVar := someValue, @anotherVar := '', @someDateVar := getdate() ) as SQLVars... basically just creates and initializes variables that can be set, changed and updated against every row the query is going to process... At the end of the SQL statement, the variables are then released. Some people pre-declare them as separate SET commands, THEN run the SQL-Select.. I prefer to do them in-line.
Now, how they are used... They can be used to track almost anything against any row being returned by the SQL-select statement, but if you ever need things in a certain order, the order clause is processed first, THEN passes the record on to be processed with the @vars... So, think of the @vars getting set within a program. They get set one in the order they are processed in the SQL command, and whatever the final result is, is stored into the final column name like any other similar function call, such as padding, trimming, upper, coalesce, etc.
Now, what is going on... lets take a look at the steps... Assume for sake of understanding, we have a single table, 10 rows corresponding 1=A, 2=B, 3=C --- 10=J. These are the natural order of records such as an auto-increment. Now, if you were to run a query select * from table order by rand(), you might get 3-C, 9-I, 2=B, 7=G, etc... This with a LIMIT 4, would only return the first 4 and you are done.
Now, take the same scenario of records 1-10 = A-J respectively. Now, lets expand to allow your "group" of multiple destinations and sizes, such as...
ID Ltr Dest Size
1 A X a
2 B Y a
3 C X a
4 D X a
5 E Y b
6 F X c
7 G Y b
8 H X a
9 I Y a
10 J X b
11 K X a
12 L Y a
13 M X a
14 N Y a
Now, you want things of all destinations, but a single size "a" for example... I added a few extras.
select * from SampleTable where Size="a" order by rand()
You could get all "X" destinations, or the "Y(a)" records and "X(a)" records, or other similar balance of available. However, no guarantee that you get 3 "X" and 3 "Y" records of size "a". If you change the ORDER to ORDER BY Dest, rand(), this will put all the "X" entries first, THEN "Y"s which a limit won't work... So, apply the principle of my select, you order by the destination and random, and apply a where clause of one Size = "a" and you might get... (by destination first, then randomized...
13 M X a
3 C X a
8 H X a
1 A X a
11 K X a
4 D X a
9 I Y a
14 N Y a
2 B Y a
12 L Y a
Now, apply the @variables to the mix... Just applying the @RandSeq and @lastDestination
@RandSeq := if( @lastDestination = di.id_destination, @RandSeq +1, 1 ) as FinalSeq, @lastDestination := di.id_destination as carryForward
Start Value of
ID Ltr Dest Size @RandSeq @lastDestination FinalSeq carryForward
13 M X a 0 '' 1 X (current record value of dest)
3 C X a 1 X 2 X
8 H X a 2 X 3 X
1 A X a 3 X 4 X
11 K X a 4 X 5 X
4 D X a 5 X 6 X
9 I Y a 6 X 1 Y (change to Y resets counter to 1)
14 N Y a 1 Y 2 Y
2 B Y a 2 Y 3 Y
12 L Y a 3 Y 4 Y
Now, if you apply a "HAVING" clause FinalSeq <= 3, you will get all the rows listed above with FinalSeq <= 3, and the remaining 4, 5, 6 of destination "X" will be ignored, and 4 of "Y" will be ignored... thus leaving 3 from each destination of the given size.
Hopefully this clarifies what's going on with the query.
Upvotes: 1
Reputation: 48827
Try something like ORDER BY RAND() LIMIT N
where N
is the number of results that should be returned. For example:
SELECT i.id_image FROM image i
JOIN destination_image di ON i.id_image = di.id_image
WHERE di.id_destination > N AND di.id_destination < M
AND i.size = X
ORDER BY RAND() LIMIT Y
Replace N
, M
, X
and Y
with your values.
Upvotes: 1