Steve Reno
Steve Reno

Reputation: 1384

Creating Reproducible Random Numbers Between 0 and 1 (SQL Server 2008)

I'm trying to figure out how to assign every record (row) in one of my tables a unique number (value) between 0 and 1 in SQL Server 2008. I have reviewed the rand(checksum(newid())) type methods and this appears to be appropriate with the exception that I need my set of values to be reproducible (i.e. if I run the same query multiple times I will get identical random values).

Any idea how I could go about this?

Upvotes: 0

Views: 2443

Answers (3)

hook
hook

Reputation: 1004

You will need to loop through the rows one at a time to get a different random value for each row. To make sure it is the same random sequence each time, pick a fixed seed, and call RAND(seed) with that seed once. Each call (in the same session) to RAND() after that will give you the next number in sequence for that seed, and will always produce the same sequence for a given seed.

There are different ways to do the looping (e.g. using a cursor), but here's one to get you started.

create table RandomValues
(
    Id  int,
    RandomValue float
)
go

-- Pick a value and always use the same one to get reproducible random numbers
declare @FixedSeed int
set @FixedSeed = 100

declare @Id int
declare @MaxId int

select @MaxId = MAX(Id), @Id = MIN(Id) from OriginalData

-- You don't need this value for anything, you just need a call to rand(@FixedSeed) 
-- to start the sequence with the fixed seed
declare @dummyseed float
set @dummyseed = rand(@FixedSeed)

while (@Id <= @MaxId)
  begin
    insert RandomValues
    select Id, rand()
    from   OriginalData
    where   Id = @Id

    -- Get the next Id from the original table
    select  @Id = MIN(Id)
    from    OriginalData
    where   Id > @Id
end

select od.*, rv.RandomValue 
from RandomValues rv
join OriginalData od on rv.Id = od.Id

Upvotes: 1

Rhys Jones
Rhys Jones

Reputation: 5518

As a number of people have already suggested, how about a lookup table for random numbers. If the main table has an integer key the following code could be used to generate a suitable lookup table;

;with t as (
   select 1 as i 
   union all 
   select i + 1 
   from t 
   where i < 1000) -- adjust accordingly
select 
   i, 
   cast(abs(checksum(newid())) / 2147483647.0 as float) as r -- the RAND function returns a float.
into dbo.RandomNumbers
from 
   t 
option(maxrecursion 0)

If you don't want to go down the lookup route then you need a function that will return a repeatable random number. I've tried a few different things and come up with rand(checksum(i,1.0/i)). It still needs an integer key.

;with t as (
   select 1 as i 
   union all 
   select i+1 
   from t 
   where i<1000) -- adjust accordingly
select 
   i, 
   rand(checksum(i,1.0/i)) as r
from 
   t 
option(maxrecursion 0)

Please note that neither of the above ideas addresses the unique aspect of your question, and I haven't analysed the distribution of the numbers but how important these are depends on what your doing.

Rhys

Upvotes: 0

user275683
user275683

Reputation:

Just like it was already mentioned in comments, you ether get random numbers or your get "some" numbers that appear random but can be recreated every time.

If you just need 0 or 1 you could do something like this.

SELECT ABS(CONVERT(BIGINT, HASHBYTES('SHA1', name)) % 2)
FROM sys.objects

By using Mudolo % function you will only get 0, 1, -1. Than you wrap it with ABS absolute value to get only 1 and 0.

I hope this gets you in the right direction. You can read about HASHBYTES here http://technet.microsoft.com/en-us/library/ms174415.aspx you can experiment with different available algorithms.

Upvotes: 1

Related Questions