Reputation: 110382
Suppose the following jobs table:
`jobpost`
- name
- position
- is_featured (boolean)
I would like to build a list of suggested jobs for a given user, where jobpost.position
matches the user's position (for example, an accountant would receive jobs in accounting).
The basic query to accomplish this would be something like:
SELECT name FROM jobpost WHERE jobpost.position IN (list of user positions) LIMIT 10
I also want to make sure that jobs that are featured (is_featured=True
) receive extra weight. Then I need to build a probility distribution list from which a random number of jobs would be selected. For this I was thinking of building a python list of tuples, with the job name and probability, and then using random.random()
. For example, something like (in pseudocode):
x = [('job 1', 0.2), ('job 2', 0.2), ('job 3', 0.2), ('job 4', 0.4)]
# pick three out of the list of jobs above
random.random.sample(x, 3)
I have three questions related to this:
Does this seem like the right approach?
How would I use the random
module (or another one) to select n number of objects with each object having a certain given probability?
In terms of giving a weighted average to a featured job over a non-featured job, would the following query be the correct approach? If not, what would be a better way?
SELECT name, 1 * (CASE WHEN is_featured=True THEN % ELSE 1) as weighted_average FROM ...
This would give me tuple with the job name and the relative weight.
Upvotes: 1
Views: 679
Reputation: 298364
I'd recommend using SQLAlchemy and make use of Python's syntax, not SQL's. But to solve your problem, here's what I'd try:
import random
x = [('job 1', 0.2), ('job 2', 0.2), ('job 3', 0.2), ('job 4', 0.2), ('job 5', 0.4)]
def random_choice(jobs_list, number=1):
jobs = jobs_list[:]
choice = random.uniform(0, 1)
total = 0
while total < number:
index, job_item = random.choice(list(enumerate(jobs)))
job, weight = job_item
if weight > random.uniform(0, 1):
yield job
jobs.pop(index)
total += 1
print list(random_choice(x, 3))
Just as a side note, in order for your "weights" to follow a discrete PDF, they need to add up exactly to 1.0
.
Truthfully, I don't think this "weighting" is a proper solution to the problem. You don't have a proper PDF to create the discrete "weight" for a given job. Instead, why not just have separate categories for jobs?
Upvotes: 1