Reputation: 2729
Using SQL Server 2016+
I have been having some difficulty in selecting random rows from a table which has been narrowed down to an issue with how random numbers are generated. For efficiency using a TOP 10 * ORDER BY NEWID()
is to slow. In the code below I have used a set seed for repeatable results, but in live I would not.
Code sample:
SELECT RAND(100) RN
, RAND()
, RAND()
, RAND()
, RAND();
SELECT RAND(100) RN
UNION ALL SELECT RAND()
UNION ALL SELECT RAND()
UNION ALL SELECT RAND()
UNION ALL SELECT RAND();
WITH cte AS
(SELECT 1 ID
, RAND(100) RN
UNION ALL
SELECT cte.ID + 1
, RAND()
FROM cte
WHERE ID < 5)
SELECT RN
FROM cte;
Result Set
RN
---------------------- ---------------------- ---------------------- ---------------------- ----------------------
0.715436657367485 0.28463380767982 0.0131039082850364 0.28769876521071 0.100505471175005
(1 row affected)
RN
----------------------
0.715436657367485
0.28463380767982
0.0131039082850364
0.28769876521071
0.100505471175005
(5 rows affected)
RN
----------------------
0.715436657367485
0.28463380767982
0.28463380767982
0.28463380767982
0.28463380767982
(5 rows affected)
As you can see from the results when I call the RAND()
function repeated in-line or via a union I get different results with each call. If however I call the function inside an iterative cte then I get the same value repeated.
This code is an example showing the problem, rather than the entire code set. I have created this purely as an example to demonstrate the problem. I have a solution based around a combination of Checksum
& NewID()
and the modulus calls and multiplication to give me values in the range I want, but this is fairly complex and seems excessive for just generating a list of random numbers in a range.
I'm looking for any guidance that can be offered on
Many thanks.
Upvotes: 3
Views: 562
Reputation: 1270301
RAND()
returns a "constant" value within a query. That is, it is evaluated once per "mention" in the query.
You can see this if you run:
select rand(), rand()
from (values (1), (2), (3)) v(x);
Each row has two values, which are different. However, the columns have the same values.
In any case, the canonical answer is to use RAND(CHECKSUM(NEWID()))
. This provides a random seed for RAND()
on each invocation:
WITH cte AS
(SELECT 1 as ID, RAND(CHECKSUM(NEWID())) as RN
UNION ALL
SELECT cte.ID + 1, RAND(CHECKSUM(NEWID())) as RN
FROM cte
WHERE ID < 5
)
SELECT RN
FROM cte;
This is probably good enough to get a list of random values. I suspect that your real problem is different -- something like extracting random rows from a table. If that is the real problem, then ask a new question or check similar questions.
Upvotes: 1