Ruby - how to generate random time intervals matching a total amount of hours?

Question

I am trying to write a simple script, where the input would be a start date, end date and a total amount of hours (150) and the script would generate a simple report containing random date-time intervals (with ideally weekdays) that would sum the entered amount of hours.

This is what I am trying to achieve:

Start: 2020-01-01
End: 2020-01-31
Total hours: 150

Report:
Jan 1, 2020, 08:02:20 – Jan 1, 2020, 08:55:00: sub time -> 52:40 (52 minutes 40 seconds)
Jan 1, 2020, 09:00:00 – Jan 1, 2020, 09:38:13: sub time -> 38:13 (38 minutes 13 seconds)
...
Jan 3, 2020, 13:15:00 – Jan 3, 2020, 14:45:13: sub time -> 01:30:13 (1 hour 30 minutes 13 seconds)
...

TOTAL TIME: 150 hours (or in minutes)

How do I generate time intervals where the total amount of minutes/hours would be equal to a given number of hours?

Cary Swoveland · Accepted Answer

I assume the question is loosely worded in the sense that "random" is not meant in a strict probabilistic sense; that is, the intent is not to select a set of intervals (totalling a given number of hours) by a mechanism that gives every possible set of such intervals an equal likelihood of being selected. Rather, I understand that a set of intervals is to be chosen (e.g., for testing purposes) in a way that incorporates elements of randomness.

I have assumed the intervals are to be non-overlapping and the number of intervals is to be specified. I don't understand what "with ideally weekdays" means, so I have disregarded that.

The heart of the approach I will propose is the following method.

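def rnd_lengths(tot_secs, target_nbr)
  # Upper bound of each draw, chosen so the mean draw is about tot_secs/target_nbr.
  max_secs = 2 * tot_secs/target_nbr - 1
  arr = []
  loop do
    # Stop (returning arr) once the lengths sum to the requested total.
    break(arr) if tot_secs.zero?
    # Draw an integer from 1..max_secs, capped at the seconds still unallocated.
    l = [(0.5 + max_secs * rand).round, tot_secs].min
    arr << l
    tot_secs -= l
  end
end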

The method generates an array of integers (lengths of intervals), measured in seconds, ideally having target_nbr elements. tot_secs is the required combined length of the "random" intervals (e.g., 150*3600).

Each element of the array is drawn randomly from a (discrete) uniform distribution that ranges from 1 to max_secs (to be computed). This is done sequentially until tot_secs is reached. Should the last random value cause the total to exceed tot_secs, it is reduced to make the total equal tot_secs.

Suppose tot_secs equals 100 and we wish to generate 4 random intervals (target_nbr = 4). That means the average length of the intervals would be 25. As we are using a uniform distribution having an average of (1 + max_secs)/2, we may derive the value of max_secs from the expression

target_nbr * (1 + max_secs)/2 = tot_secs

which is

max_secs = 2 * tot_secs/target_nbr - 1

the first line of the method. For the example I mentioned, this would be

max_secs = 2 * 100/4 - 1
  #=> 49
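As a quick sanity check on this derivation, the sketch below draws many values with the same expression that rnd_lengths uses and compares the sample mean with (1 + max_secs)/2. The helper name draw_length, the sample size and the seed are assumptions made purely for illustration.

```ruby
# Draw one length using the same expression as rnd_lengths
# (draw_length is a hypothetical helper, not part of the answer's code).
def draw_length(max_secs, rng)
  (0.5 + max_secs * rng.rand).round
end

max_secs = 2 * 100 / 4 - 1            # 49, as in the example
rng      = Random.new(42)             # fixed seed, chosen arbitrarily
samples  = Array.new(100_000) { draw_length(max_secs, rng) }

sample_mean   = samples.sum.to_f / samples.size
expected_mean = (1 + max_secs) / 2.0  # 25.0

puts "observed range: #{samples.min}..#{samples.max}"
puts "sample mean:    #{sample_mean.round(2)}"
puts "expected mean:  #{expected_mean}"
```

The sample mean should land close to 25, which is the average interval length needed for 4 intervals to total 100 seconds.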

Let's try it.

rnd_lengths(100, 4)
  #=> [49, 36, 15]

As you see, the array that is returned sums to 100, as required, but it contains only 3 elements. That's why I named the argument target_nbr: there is no assurance the array returned will have that number of elements. What to do? Try again!

rnd_lengths(100, 4)
  #=> [14, 17, 26, 37, 6]

Still not 4 elements, so keep trying:

rnd_lengths(100, 4)
  #=> [11, 37, 39, 13]

Success! It may take a few tries to get the correct number of elements, but for parameters likely to be used, and the nature of the probability distribution employed, I wouldn't expect that to be a problem.

Let's put this in a method.

def rdm_intervals(tot_secs, nbr_intervals)
  loop do
    arr = rnd_lengths(tot_secs, nbr_intervals) 
    break(arr) if arr.size == nbr_intervals
  end
end

intervals = rdm_intervals(100, 4)
  #=> [29, 26, 7, 38]
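To get a feel for how many retries this loop needs in practice, one could count the attempts over many runs. This is only a rough sketch: tries_needed is a hypothetical helper, and the parameters (150 hours, 8 intervals, 1,000 runs) are arbitrary choices.

```ruby
# Same method as above, repeated so this snippet runs on its own.
def rnd_lengths(tot_secs, target_nbr)
  max_secs = 2 * tot_secs / target_nbr - 1
  arr = []
  loop do
    break(arr) if tot_secs.zero?
    l = [(0.5 + max_secs * rand).round, tot_secs].min
    arr << l
    tot_secs -= l
  end
end

# Count how many calls to rnd_lengths are needed before the array has
# exactly nbr_intervals elements (hypothetical helper for illustration).
def tries_needed(tot_secs, nbr_intervals)
  tries = 0
  loop do
    tries += 1
    break tries if rnd_lengths(tot_secs, nbr_intervals).size == nbr_intervals
  end
end

runs = Array.new(1_000) { tries_needed(3600 * 150, 8) }
puts "average tries: #{runs.sum.to_f / runs.size}"
puts "worst case:    #{runs.max}"
```

The numbers vary from run to run, so no expected output is shown.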

We can compute random gaps between intervals in the same way. Suppose the intervals fall within a range of 175 seconds (the number of seconds between the start time and end time). Then:

gaps = rdm_intervals(175-100, 5)
  #=> [26, 5, 19, 4, 21]

As seen, the gaps sum to 75, as required. The last element is the time left over after the final interval, so it is not needed to position the intervals.

We can now form the intervals. The first interval begins at 26 seconds and ends at 26+29 #=> 55 seconds. The second interval begins at 55+5 #=> 60 seconds and ends at 60+26 #=> 86 seconds, and so on. We therefore find the intervals (each in ranges of seconds from zero) to be:

[26..55, 60..86, 105..112, 116..154]

Note that 175 - 154 = 21, the last element of gaps.
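The arithmetic above can be sketched as a small method that walks through the lengths and gaps in step. The name to_ranges is an assumption for illustration; the inputs are the arrays from the example.

```ruby
# Turn interval lengths and gaps into ranges of seconds measured from zero.
# gaps holds one more element than lengths; the final gap is the unused
# time between the end of the last interval and the end of the period.
def to_ranges(lengths, gaps)
  pos = 0
  lengths.zip(gaps).map do |len, gap|
    start = pos + gap   # skip the gap that precedes this interval
    pos   = start + len # the interval ends len seconds later
    start..pos
  end
end

intervals = [29, 26, 7, 38]
gaps      = [26, 5, 19, 4, 21]

ranges = to_ranges(intervals, gaps)
p ranges  #=> [26..55, 60..86, 105..112, 116..154]
```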

If one is uncomfortable with the fact that the last elements of intervals and gaps are generally constrained in size, one could of course randomly reposition those elements within their respective arrays.
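One simple way to do that repositioning, offered as a sketch rather than as part of the method above, is Array#shuffle, which reorders the elements without changing their sum.

```ruby
intervals = [29, 26, 7, 38]
gaps      = [26, 5, 19, 4, 21]

# Array#shuffle returns a new array with the same elements in random order,
# so the totals are unchanged and the constrained last element can end up
# at any position.
shuffled_intervals = intervals.shuffle
shuffled_gaps      = gaps.shuffle

puts shuffled_intervals.sum  # 100, same as before
puts shuffled_gaps.sum       # 75, same as before
```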

One might not care if the number of intervals is exactly target_nbr. It would be simpler and faster to just use the first array of interval lengths produced. That's fine, but we still need the above methods to compute the random gaps, as their number must equal the number of intervals plus one:

gaps = rdm_intervals(175-100, intervals.size + 1)

We can now use these two methods to construct a method that will return the desired result. The argument tot_secs of this method equals the total number of seconds covered by the intervals returned (e.g., 3600 * 150). The method returns an array containing nbr_intervals non-overlapping ranges of Time objects that fall between the given start and end dates.

require 'date'

def construct_intervals(start_date_str, end_date_str, tot_secs, nbr_intervals)
  start_time = Date.strptime(start_date_str, '%Y-%m-%d').to_time
  # Length of the whole period in seconds (Time - Time returns a Float).
  secs_in_period = Date.strptime(end_date_str, '%Y-%m-%d').to_time - start_time
  intervals = rdm_intervals(tot_secs, nbr_intervals)
  # One gap before each interval plus one after the last.
  gaps = rdm_intervals(secs_in_period - tot_secs, nbr_intervals + 1)
  nbr_intervals.times.with_object([]) do |_, arr|
    start_time += gaps.shift               # skip the gap preceding this interval
    end_time = start_time + intervals.shift
    arr << (start_time..end_time)
    start_time = end_time                  # the next gap starts where this interval ends
  end
end

See Date::strptime.
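The properties claimed for the result (the right number of intervals, the right total length, no overlaps, everything inside the period) can be checked programmatically. The snippet below is a sketch of such a check; it repeats the three methods so that it runs on its own, and the variable names in the checking part are my own.

```ruby
require 'date'

# The three methods from above, repeated so this snippet runs on its own.
def rnd_lengths(tot_secs, target_nbr)
  max_secs = 2 * tot_secs / target_nbr - 1
  arr = []
  loop do
    break(arr) if tot_secs.zero?
    l = [(0.5 + max_secs * rand).round, tot_secs].min
    arr << l
    tot_secs -= l
  end
end

def rdm_intervals(tot_secs, nbr_intervals)
  loop do
    arr = rnd_lengths(tot_secs, nbr_intervals)
    break(arr) if arr.size == nbr_intervals
  end
end

def construct_intervals(start_date_str, end_date_str, tot_secs, nbr_intervals)
  start_time = Date.strptime(start_date_str, '%Y-%m-%d').to_time
  secs_in_period = Date.strptime(end_date_str, '%Y-%m-%d').to_time - start_time
  intervals = rdm_intervals(tot_secs, nbr_intervals)
  gaps = rdm_intervals(secs_in_period - tot_secs, nbr_intervals + 1)
  nbr_intervals.times.with_object([]) do |_, arr|
    start_time += gaps.shift
    end_time = start_time + intervals.shift
    arr << (start_time..end_time)
    start_time = end_time
  end
end

tot_secs     = 3600 * 150
period_start = Date.strptime('2020-01-01', '%Y-%m-%d').to_time
period_end   = Date.strptime('2020-01-31', '%Y-%m-%d').to_time
result       = construct_intervals('2020-01-01', '2020-01-31', tot_secs, 8)

# Sum of the interval lengths, in seconds.
total_secs      = result.sum { |r| r.end - r.begin }
# Each interval must end no later than the next one begins.
non_overlapping = result.each_cons(2).all? { |a, b| a.end <= b.begin }
# The first interval must not start before the period, nor the last end after it.
inside_period   = result.first.begin >= period_start && result.last.end <= period_end

puts "intervals:       #{result.size}"
puts "total hours:     #{total_secs / 3600}"
puts "non-overlapping: #{non_overlapping}"
puts "inside period:   #{inside_period}"
```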

Let's try an example.

start_date_str = '2020-01-01'
end_date_str   = '2020-01-31' 
tot_secs       = 3600*150
  #=> 540000

construct_intervals(start_date_str, end_date_str, tot_secs, 4)
  #=> [2020-01-06 18:05:04 -0800..2020-01-09 03:48:00 -0800,
  #    2020-01-09 06:44:16 -0800..2020-01-11 23:33:44 -0800,
  #    2020-01-20 20:30:21 -0800..2020-01-21 17:27:44 -0800,
  #    2020-01-27 19:08:38 -0800..2020-01-28 01:38:51 -0800]

construct_intervals(start_date_str, end_date_str, tot_secs, 8)
  #=> [2020-01-03 18:43:36 -0800..2020-01-04 10:49:14 -0800,
  #    2020-01-08 07:55:44 -0800..2020-01-08 08:17:18 -0800,
  #    2020-01-11 00:54:36 -0800..2020-01-11 23:00:53 -0800,
  #    2020-01-14 05:20:14 -0800..2020-01-14 22:48:45 -0800,
  #    2020-01-16 18:28:28 -0800..2020-01-17 22:50:24 -0800,
  #    2020-01-22 02:59:31 -0800..2020-01-22 22:33:08 -0800,
  #    2020-01-23 00:36:59 -0800..2020-01-24 12:15:37 -0800,
  #    2020-01-29 11:22:21 -0800..2020-01-29 21:46:10 -0800]
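To turn such an array of ranges into a report like the one sketched in the question, one possibility is shown below. The helpers format_duration and print_report are hypothetical names, the output layout is only an approximation of the question's format, and the sample data is hand-written rather than produced by construct_intervals.

```ruby
# Format a duration in seconds as HH:MM:SS (hypothetical helper).
def format_duration(secs)
  secs = secs.round
  format('%02d:%02d:%02d', secs / 3600, secs % 3600 / 60, secs % 60)
end

# Print one line per interval followed by the total, and return the total
# number of seconds (hypothetical helper). ranges is an array of Range
# objects whose endpoints are Time objects.
def print_report(ranges)
  fmt   = '%b %-d, %Y, %H:%M:%S'
  total = 0
  ranges.each do |r|
    secs = r.end - r.begin
    total += secs
    puts "#{r.begin.strftime(fmt)} - #{r.end.strftime(fmt)}: " \
         "sub time -> #{format_duration(secs)}"
  end
  puts "TOTAL TIME: #{format_duration(total)}"
  total
end

# Hand-written sample data standing in for the output of construct_intervals.
sample = [
  Time.utc(2020, 1, 1, 8, 2, 20)..Time.utc(2020, 1, 1, 8, 55, 0),
  Time.utc(2020, 1, 3, 13, 15, 0)..Time.utc(2020, 1, 3, 14, 45, 13)
]

report_total = print_report(sample)
```

For the two sample intervals the sub times are 00:52:40 and 01:30:13, giving a total of 02:22:53.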
