How to generate random dates based on the probability of the days in Python?

Question

I would like to generate a random list of length n based on the dates of, say, September. So, you have your list like this:

september = ["01/09/2019","02/09/2019",...,"30/09/2019"]

And I would like to generate a list that contains, say, 1000 elements taken randomly from september like this:

dates = ["02/09/2019","02/09/2019","07/09/2019",...,"23/09/2019"]

I could use something like:

dates = np.random.choice(september,1000)

But the catch is that I want dates to be selected based on the probabilities of the days of the week. So for example, I have a dictionary like this:

days = {"Monday":0.1,"Tuesday":0.4,"Wednesday":0.1,"Thursday":0.05,"Friday":0.05,"Saturday":0.2,"Sunday":0.1}

So as "01/09/2019" was a Sunday, I would like to choose this date from september with probability 0.1.

My attempt was to create a list whose first element is the probability of the first date in september and after 7 days this probability repeats and so on, like this:

p1 = [0.1,0.1,0.4,0.1,0.05,0.05,0.2,0.1,0.1,0.4,0.1,0.05,0.05,...]

Obviously this doesn't add to 1, so I would do the following:

p2 = [x/sum(p1) for x in p1]

And then:

dates = np.random.choice(september,1000,p=p2)

However, I am not sure this really works... Can you help me?
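
One way to build the weight list described above without typing it by hand is to look up each date's weekday name. This is a minimal sketch, assuming the dates are held as datetime.date objects; the names september_dates, idx and implied are illustrative and do not come from the question.

```python
import datetime
from collections import Counter

import numpy as np

days = {"Monday": 0.1, "Tuesday": 0.4, "Wednesday": 0.1, "Thursday": 0.05,
        "Friday": 0.05, "Saturday": 0.2, "Sunday": 0.1}

# All 30 days of September 2019 as date objects.
september_dates = [datetime.date(2019, 9, 1) + datetime.timedelta(days=k)
                   for k in range(30)]

# p1: one weight per date, looked up from that date's weekday name.
p1 = np.array([days[d.strftime("%A")] for d in september_dates])

# p2: normalised so the weights sum to 1, as np.random.choice requires.
p2 = p1 / p1.sum()

# Sample positions, then map them back to dd/mm/YYYY strings.
np.random.seed(0)
idx = np.random.choice(len(september_dates), size=1000, p=p2)
dates = [september_dates[i].strftime("%d/%m/%Y") for i in idx]

# Total probability that a single draw lands on each weekday under p2.
implied = Counter()
for d, p in zip(september_dates, p2):
    implied[d.strftime("%A")] += p
```

With these weights each individual date is drawn in proportion to its weekday's weight. Because September 2019 contains five Sundays and five Mondays but only four of every other weekday, the per-weekday totals shift slightly: implied["Sunday"] is 0.5/4.2 (about 0.119) rather than 0.1, and implied["Tuesday"] is 1.6/4.2 (about 0.381) rather than 0.4.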

Henry Yik · Accepted Answer

Actually I think your approach is fine. But instead of using the dates, first get a list of dates grouped by weekdays:

import numpy as np
import datetime
from collections import defaultdict

days = {"Monday":0.1,"Tuesday":0.4,"Wednesday":0.1,"Thursday":0.05,"Friday":0.05,"Saturday":0.2,"Sunday":0.1}

date_list = [(datetime.datetime(2019, 9, 1) + datetime.timedelta(days=x)) for x in range(30)]

d = defaultdict(list)

for i in date_list:
    d[i.strftime("%A")].append(i)
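
As a quick sanity check on the grouping, the number of dates under each weekday can be counted. The snippet below rebuilds d so that it runs on its own; the name counts is illustrative.

```python
import datetime
from collections import defaultdict

date_list = [datetime.datetime(2019, 9, 1) + datetime.timedelta(days=x)
             for x in range(30)]

d = defaultdict(list)
for i in date_list:
    d[i.strftime("%A")].append(i)

# Number of September 2019 dates that fall on each weekday.
counts = {weekday: len(dates) for weekday, dates in d.items()}
print(counts)
```

Sunday and Monday each hold five dates and the other weekdays hold four, so the lists in d do not all have the same length.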

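Now draw the weekdays with np.random.choice. Sample the weekday names rather than the lists of dates themselves: September 2019 has five Sundays and Mondays but four of every other weekday, so list(d.values()) is ragged, and NumPy 1.24 and later raise a ValueError when asked to turn it into an array:

np.random.seed(500)

weekdays = list(d.keys())
result = np.random.choice(weekdays,
                          p=[days[w] for w in weekdays],
                          size=1000)

You now have an array of 1000 weekday names drawn with the given weights. Just do another random.choice to pick a date within each chosen weekday:

final = [np.random.choice(d[w]) for w in result]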
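
Putting the two stages together, a self-contained version might look like the following. It is a sketch rather than the answer's exact code: the function name sample_dates and its parameters are hypothetical, and it uses np.random.default_rng (NumPy's Generator interface) in place of the global np.random.seed.

```python
import datetime
from collections import defaultdict

import numpy as np


def sample_dates(date_list, weekday_probs, size, seed=None):
    """Draw `size` dates: pick a weekday by weight, then a date uniformly within it.

    Hypothetical helper for illustration; assumes every weekday in
    `weekday_probs` occurs at least once in `date_list`.
    """
    rng = np.random.default_rng(seed)

    # Group the candidate dates by weekday name.
    by_weekday = defaultdict(list)
    for day in date_list:
        by_weekday[day.strftime("%A")].append(day)

    # Stage 1: draw weekday names with the requested probabilities.
    weekdays = list(by_weekday)
    probs = [weekday_probs[w] for w in weekdays]
    picked = rng.choice(weekdays, size=size, p=probs)

    # Stage 2: draw one date uniformly from the chosen weekday's dates.
    return [by_weekday[w][rng.integers(len(by_weekday[w]))] for w in picked]


days = {"Monday": 0.1, "Tuesday": 0.4, "Wednesday": 0.1, "Thursday": 0.05,
        "Friday": 0.05, "Saturday": 0.2, "Sunday": 0.1}
date_list = [datetime.datetime(2019, 9, 1) + datetime.timedelta(days=x)
             for x in range(30)]

final = sample_dates(date_list, days, size=1000, seed=500)

# Back to the dd/mm/YYYY strings used in the question.
dates = [day.strftime("%d/%m/%Y") for day in final]
```

Compared with normalising per-date weights, this keeps each weekday's overall share equal to its value in days, and every date within a given weekday is equally likely.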
