tom

Reputation: 1077

Generate random consecutive dates with variable length in Python

I'd like to generate random consecutive dates in a given time frame. I've seen some approaches for generating random dates, and I have one attempt below in a function, but I think there is a more compact way to do it. Any help is appreciated!

The function does a few things:

  1. Creates consecutive dates from a random starting point in a date window
  2. The length of each run of consecutive dates varies between 2 and 5 days
  3. The function ensures that the consecutive dates fall between the given start and end dates.

The code is below. Is there a more compact way to write this function?

import datetime
import random

def illness(start_date,end_date):
  funct_ill_list=[]   #list that will store all dates 

  #randomly choosing first date
  diff = end_date - start_date
  random_number = random.randint(0,diff.days-5)  
 
  
  temp = start_date + datetime.timedelta(random_number)    #temp = 1 march + "random_number" days 
  funct_ill_list.append(temp)                              #adding the first date to list

  
  #adding next 'n' (2-5) consecutive dates after our last element in list(most recently added date in list)
  # ----------------FOR EXAMPLE - funct_ill_list = ['4 march','5 march','6 march']; random_number=3
 
  while funct_ill_list[-1]<=end_date:                              #stop when last element in list (most recently added date in list) exceeds end_date
      random_number = random.randint(2,5)                           #for 2-5 random days (consecutive) - get a random integer between 2 to 5
      last = funct_ill_list[-1]+datetime.timedelta(random_number)   #'last' variable stores maximum possible date we will have with chosen random_number
      if last>end_date:   
        funct_ill_list.pop()
        break
      else:                                          #else add next 'random_number' dates to list
        ref_date = funct_ill_list[-1]                #last element of list
        for i in range(1,random_number):             #'i' takes values from 1 to (random_number-1).
          temp = ref_date + datetime.timedelta(i)    #each time add 'i' days to ref_date to get new date 'temp'
          funct_ill_list.append(temp)                #add this new date to list.
      
      #for next random date
      # --------------FOR EXAMPLE - funct_ill_list = ['4 march','5 march','6 march','25 march']
      diff = end_date - funct_ill_list[-1]                        #get no. of days remaining b/w end_date and last element added to list
      if diff.days>=2:                                            #if diff.days=0 or 1, ie. last element in list is either end_date or end_date-1,
                                                                  #No point of adding more dates for next round (since 2-5 consecutive days won't be possible), else add.
       
        random_number = random.randint(2,diff.days)                   #randomly choose a number
        temp = funct_ill_list[-1] + datetime.timedelta(random_number) #adding "random_number" days to last element of list
        funct_ill_list.append(temp)                                   #adding it to list
  return funct_ill_list



ill_start_date = datetime.date(2020, 2, 1)
ill_end_date = datetime.date(2020, 4, 30)
month_time_frame = [3,5] #static two months time frame (Looks at matching dates in March and May months)

Upvotes: 0

Views: 348

Answers (1)

pho

Reputation: 25489

You don't really need to iterate and throw away dates that fail a condition, because there's not much randomness involved here. All you need are two random numbers:

  • How many days are going to be in your return value? Let's call this m, and this is an integer in the range [2, 5].
  • When in the start_date-end_date interval should we start our sequence of dates? Or, how many days after start_date should our first date be? Let's call this N.

It's pretty easy to notice that N is restricted by the number of days in our given start-end range (let's call this R) and the duration of the illness (m). Specifically, we know that 0 <= N <= R - m, so that the last date of the sequence doesn't fall after end_date.

Once we have N, we know that the dates we want are:

  • start_date + N days
  • start_date + N + 1 days
  • ... and so on ...
  • start_date + N + m - 1 days

This can be generated using a simple loop or list comprehension: start_date + d days for each d in range(N, N + m).

import random
import datetime

def illness(start_date, end_date):
    # Duration of start-end range (R)
    max_days = (end_date - start_date).days

    # How many days the illness lasts (m)
    illness_days = random.randint(2, min(max_days, 5))

    # Number of days after start_date that the first random date is on (N)
    illness_start = random.randint(0, max_days - illness_days)

    # Make a list of illness_days consecutive dates starting from the first illness day
    illness_dates = [start_date + datetime.timedelta(days=illness_start + d) for d in range(illness_days)] 

    return illness_dates

Running this:

sd = datetime.date(2021, 1, 1)
ed = datetime.date.today()
illness(sd, ed)

gives a random (in this case, 4) number of consecutive dates randomly located in the given interval:

[datetime.date(2021, 4, 26),
 datetime.date(2021, 4, 27),
 datetime.date(2021, 4, 28),
 datetime.date(2021, 4, 29)]
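
If you want to convince yourself that the bound on N keeps every result inside the window, one option is to call the function many times and check each result. The sketch below repeats the function from above so it runs on its own; `check_illness` and the number of trials are names and values I made up for illustration, not part of the original code.

```python
import datetime
import random


def illness(start_date, end_date):
    # Same logic as the function above, repeated so this block is self-contained
    max_days = (end_date - start_date).days
    illness_days = random.randint(2, min(max_days, 5))
    illness_start = random.randint(0, max_days - illness_days)
    return [start_date + datetime.timedelta(days=illness_start + d)
            for d in range(illness_days)]


def check_illness(start_date, end_date, trials=1000):
    # Hypothetical helper: True if every sampled result meets the requirements
    one_day = datetime.timedelta(days=1)
    for _ in range(trials):
        dates = illness(start_date, end_date)
        # Between 2 and 5 dates
        if not 2 <= len(dates) <= 5:
            return False
        # All dates inside the given window
        if dates[0] < start_date or dates[-1] > end_date:
            return False
        # Each date is exactly one day after the previous one
        if any(b - a != one_day for a, b in zip(dates, dates[1:])):
            return False
    return True


print(check_illness(datetime.date(2020, 2, 1), datetime.date(2020, 4, 30)))
```

Note that the window has to span at least two days. With a shorter window, `random.randint(2, min(max_days, 5))` is called with an empty range and raises a `ValueError`.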

Upvotes: 1
