Don
Don

Reputation: 49

Select a random row from table, but with odds?

I have a table that describes a variety of objects in my system (ie. umbrella, boots, satchel, whatever). Each one of these objects needs to have a distinct prevalence or incidence. For example, the umbrella is rarer than the boots. Based on those factors, I need to randomly select a single object (including a blank or 'no object found') based on that incidence value.

Yikes. Make sense?

Upvotes: 3

Views: 478

Answers (3)

Syntax Error
Syntax Error

Reputation: 4527

I'm going to modify symcbean's answer for this, +1 for symcbean.

SELECT * FROM some_table
WHERE (100*RAND()) < some_table.percent_probability

This will return ALL results that match the probability you intuitively want to assign to them. For example, 5 objects with a probability of 20 will all be returned 20% of the time. Objects with a value of 90 will be returned 90% of the time.

So your result will be more than one object, but you've kept the rare ones from showing up as often. So now just grab one of your results at random. An easy way would be to stick them all in an array and:

$items = array(); // assuming you've already filled $items with your 
                  // query results, one item for each array key

$count = count($items);

$chosen_key = rand(1,$count)-1;

$chosen_item = $items[$chosen_key];

Upvotes: 0

VolkerK
VolkerK

Reputation: 96159

If you have a write-seldom-read-many scenario (i.e. you change the objects and the probabilities seldom) you might want to pre-calculate the probability values so that if you have a single random value you can unambiguously decide which object to pick (with a single pick, no sorting, no comparison of all records needed).

E.g. (probabilities in per-mill)
umbrella: 500‰ chance
boots: 250‰ chance
satchel: 100‰ chance
whatever: 100‰ chance
"nothing": 50‰ chance

A random number between 0 and 499 means "umbrella" has been picked, 500-749 "boots" and so on.

INSERT INTO foo (name, randmin, randmax) VALUES
  ('umbrella', 0, 499),  
  ('boots', 500, 749),
  ('satchel', 750, 849), 
  ('whatever', 850, 949) 

Every time you add an object or modify the probabilities re-create this table.

Then all you need is a query like

SELECT
  f.name
FROM
  (  
    SELECT Round(Rand()*1000) as r    
  )  as tmp
JOIN
  foo as f  
ON
  r BETWEEN f.randmin and f.randmax  
LIMIT
  1

Only one random value has to be generated and MySQL can use an index on (randmin,randmax) to find the record quickly.

Upvotes: 1

symcbean
symcbean

Reputation: 48387

SELECT * FROM some_table
WHERE (100*RAND()) > some_table.percent_probability
LIMIT 1

....and the probability of selection is stored in the percent_probability field.

C.

Upvotes: 1

Related Questions