Reputation: 777
I've got a statement that inserts values into a table based on a random value and another value
INSERT INTO TRAINER_SYNC (Steps, TrainerId, SyncDate)
SELECT AverageSteps * (RAND()*(1.15-0.85)+0.85), Id, GETDATE()
FROM TRAINER
This works fine, but whenever two "Trainers" have the same "AverageSteps", the number inserted into the table is always the same.
For example, if all the trainers have average steps of 10000
, here is the result in the TRAINER_SYNC
table
Id TrainerId SyncDate Steps
10 1 2018-10-20 18:42:16.407 9482
11 2 2018-10-20 18:42:16.407 9482
12 3 2018-10-20 18:42:16.407 9482
13 4 2018-10-20 18:42:16.407 9482
14 5 2018-10-20 18:42:16.407 9482
15 6 2018-10-20 18:42:16.407 9482
16 7 2018-10-20 18:42:16.407 9482
17 8 2018-10-20 18:42:16.407 9482
18 9 2018-10-20 18:42:16.407 9482
Is there a way to get the RAND()
in the insert statement to be more... "random"? I'm not sure what it is called, but reset the initial seed so each iteration of the insert gives me a different random value?
Upvotes: 0
Views: 298
Reputation: 1269913
rand()
is a special class of function that is evaluated exactly once for each call in the SQL statement. This can be really confusing, because:
select rand(), rand()
returns two different values. However,
select rand(), rand()
from (values (1), (2), (3)) v(x)
returns three rows -- with the same two values on each of the rows. This is an optimization "feature" of SQL Server. I believe that getdate()
(and similar date/time functions) are the only other functions with this behavior.
The normal solution is to seed the random number generator with newid()
. But, the return types are different, so checksum()
is used to convert the id
to a number:
INSERT INTO TRAINER_SYNC (Steps, TrainerId, SyncDate)
SELECT AverageSteps * (RAND(CHECKSUM(NEWID()))*(1.15-0.85)+0.85),
Id, GETDATE()
FROM TRAINER;
You could use a different seed, if one is available. For instance, if only one row is going in per trainer, you could use rand(id)
. I should note that rand()
is not particularly "random" with seeds, so adjacent seeds often produce similar numbers.
You probably should not be setting GETDATE()
in the INSERT
. You can create the table with:
SyncDate datetime default getdate()
So the default value is the insertion time.
Upvotes: 1
Reputation: 1041
you can use
1)
RAND(CHECKSUM(NEWID()))
OR
2)
INSERT INTO TRAINER_SYNC (Steps, TrainerId, SyncDate)
SELECT AverageSteps * (RAND(AverageSteps)*(1.15-0.85)+0.85), Id, GETDATE()
FROM TRAINER
Upvotes: 0