Lav
Lav

Reputation: 99

selecting random rows based on weight on another row

I need to select random rows from a table based on weight in another row. Like if the user enters random value 50 I need to select 50 random rows from the table being that the rows with higher weight gets returned more number of times. I saw using NEWID() to select n number of random rows and this link

Random Weighted Choice in T-SQL

where we can select one row based on the weight from another row but I need to select several rows based on user random input number ,so will the best way be using the suggested answer in the above link and looping over it n number of times(but I think it would return the same row) is there any other easy solution.

MY table is like this

ID Name Freq
1  aaa  50
2  bbb  30
3  ccc  10

so when the user enters 50 I need to return 50 random names so it should be like more aaa ,bbb than ccc.Might be like 25 aaa 15 bbb and 10 ccc. Anything close to this will work to.I saw this answer but when I execute against my DB it seems to be running for 5mins and no results yet. SQL : select one row randomly, but taking into account a weight

Upvotes: 1

Views: 1477

Answers (2)

chezy525
chezy525

Reputation: 4174

I think the difficult part here is getting any individual row to potentially appear more than once. I'd look into doing something like the following:

1) Build a temp table, duplicating records according to their frequency (I'm sure there's a better way of doing this, but the first answer that came to my mind was a simple while loop... This particular one really only works if the frequency values are integers)

create table #dup
(
    id  int,
    nm  varchar(10)
)

declare @curr int, @maxFreq int
select @curr=0, @maxFreq=max(freq)
from tbl

while @curr < @maxFreq
 begin
    insert into #dup
    select id, nm
    from tbl
    where freq > @curr

    set @curr = @curr+1
 end

2) Select your top records, ordered by a random value

select top 10 *
from #dup
order by newID()

3) Cleanup

drop table #dup

Upvotes: 2

beiller
beiller

Reputation: 3135

Maybe could you try something like the following:

ORDER BY Freq * rand()

in your sql? So columns with a higher Freq value should in theory get returned more often than those with a lower Freq value. It seems a bit hackish but it might work!

Upvotes: 0

Related Questions