Using SQL/Python to build probability distribution

Question

Suppose the following jobs table:

`jobpost`
 - name
 - position
 - is_featured (boolean)

I would like to build a list of suggested jobs for a given user, where jobpost.position matches the user's position (for example, an accountant would receive jobs in accounting).

The basic query to accomplish this would be something like:

SELECT name FROM jobpost WHERE jobpost.position IN (list of user positions) LIMIT 10

I also want to make sure that featured jobs (is_featured=True) receive extra weight. Then I need to build a probability distribution from which a number of jobs would be selected at random. For this I was thinking of building a Python list of tuples, each holding a job name and its probability, and then using the random module. For example, something like (in pseudocode):

import random

x = [('job 1', 0.2), ('job 2', 0.2), ('job 3', 0.2), ('job 4', 0.4)]
# pick three out of the list of jobs above
# (note: random.sample picks uniformly and does not use the probabilities)
random.sample(x, 3)
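One possible way to pick n distinct jobs so that heavier weights are more likely is the Efraimidis-Spirakis key trick: give each item a random key `u ** (1 / weight)` and keep the n largest keys. The sketch below is only an illustration of that idea, not necessarily the best fit here; `weighted_sample` and `jobs` are hypothetical names introduced for the example, and it assumes every weight is strictly positive.

```python
import random


def weighted_sample(items, n, rng=random):
    """Pick n distinct names from (name, weight) pairs without replacement.

    Hypothetical helper: items with larger weights are more likely to be
    picked. Assumes all weights are strictly positive.
    """
    if n > len(items):
        raise ValueError("cannot pick more items than are available")
    if any(weight <= 0 for _, weight in items):
        raise ValueError("all weights must be positive")
    # Efraimidis-Spirakis: draw u in [0, 1) per item and use u ** (1 / weight)
    # as its key; a larger weight pushes the key toward 1.
    keyed = [(rng.random() ** (1.0 / weight), name) for name, weight in items]
    # Keep the n items with the largest keys.
    keyed.sort(reverse=True)
    return [name for _, name in keyed[:n]]


jobs = [('job 1', 0.2), ('job 2', 0.2), ('job 3', 0.2), ('job 4', 0.4)]
picked = weighted_sample(jobs, 3)
```

The weights do not need to sum to 1 for this to work, so raw relative weights (for example 1 for a regular job and 2 for a featured one) could be passed in directly.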

I have three questions related to this:

1. Does this seem like the right approach?
2. How would I use the random module (or another one) to select n objects, with each object having a given probability?
3. In terms of giving a featured job more weight than a non-featured job, would the following query be the correct approach? If not, what would be a better way?

SELECT name, 1 * (CASE WHEN is_featured=True THEN % ELSE 1 END) AS weighted_average FROM ...

This would give me a tuple for each job, containing the job name and its relative weight.
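To make the idea concrete, here is a minimal sketch of how the weights could be computed in SQL and then normalized into probabilities in Python. It uses an in-memory SQLite database purely for illustration; the sample rows, the helper names `suggested_job_weights` and `to_probabilities`, and the featured weight of 2 are all assumptions, not values from the real schema.

```python
import sqlite3

FEATURED_WEIGHT = 2  # assumed value; regular jobs get weight 1


def suggested_job_weights(conn, positions, featured_weight=FEATURED_WEIGHT):
    """Return (name, weight) tuples for jobs matching the given positions."""
    # One "?" placeholder per position, so the values are bound, not interpolated.
    placeholders = ", ".join("?" for _ in positions)
    sql = (
        "SELECT name, CASE WHEN is_featured THEN ? ELSE 1 END AS weight "
        "FROM jobpost WHERE position IN (" + placeholders + ") ORDER BY name"
    )
    return conn.execute(sql, [featured_weight, *positions]).fetchall()


def to_probabilities(rows):
    """Normalize relative weights so they sum to 1."""
    total = sum(weight for _, weight in rows)
    return [(name, weight / total) for name, weight in rows]


# Hypothetical sample data mirroring the jobpost table described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobpost (name TEXT, position TEXT, is_featured BOOLEAN)")
conn.executemany(
    "INSERT INTO jobpost VALUES (?, ?, ?)",
    [
        ("job 1", "accounting", False),
        ("job 2", "accounting", False),
        ("job 3", "accounting", False),
        ("job 4", "accounting", True),
        ("job 5", "engineering", True),
    ],
)

rows = suggested_job_weights(conn, ["accounting"])
probabilities = to_probabilities(rows)
```

With this sample data the featured accounting job ends up twice as likely as each regular one, which matches the shape of the list of tuples sketched earlier.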
