Reputation: 1133
I have several time points taken from a video with some maximum time length (T). These points are stored in a list of lists as follows:
time_pt_nested_list = [
[0.0, 6.131, 32.892, 43.424, 46.969, 108.493, 142.69, 197.025, 205.793, 244.582, 248.913, 251.518, 258.798, 264.021, 330.02, 428.965],
[11.066, 35.73, 64.784, 151.31, 289.03, 306.285, 328.7, 408.274, 413.64],
[48.447, 229.74, 293.19, 333.343, 404.194, 418.575],
[66.37, 242.16, 356.96, 424.967],
[78.711, 358.789, 403.346],
[84.454, 373.593, 422.384],
[102.734, 394.58],
[158.534],
[210.112],
[247.61],
[340.02],
[365.146],
[372.153]]
Each list above is associated with some probability; I'd like to randomly select points from each list according to its probability to form n tuples of contiguous time spans, such as the following:
[(0,t1),(t1,t2),(t2,t3),...,(t_(n-1),T)]
where n is specified by the user. The returned tuples should contain only floating point numbers from the nested list above. I want to assign the first list the highest probability of being sampled and appearing in the returned tuples, the second list a slightly lower probability, etc. The exact details of these probabilities are not important, but it would be nice if the user could input a parameter that controls how fast the probability decays as the list index idx increases.
The returned tuples are timeframes that should exactly cover the entire video and should not overlap. 0 and T may not necessarily appear in time_pt_nested_list (but they may). Are there nice ways to implement this? I would be grateful for any insightful suggestions.
For example, if the user inputs 6 as the number of subclips, then this would be an example output:
[(0.0, 32.892), (32.892, 64.784), (64.784, 229.74), (229.74, 306.285), (306.285, 418.575), (418.575, 437.47)]
All numbers appearing in the tuples appear in time_pt_nested_list, except 0.0 and 437.47. (Well, 0.0 does appear here, but it may not in other cases.) Here 437.47 is the length of the video, which is also given and may not appear in the list.
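The constraints above can be written down as a small check. This is only a sketch: the function name is_valid_partition and its arguments are made up for illustration, and the allowed list in the usage lines is an abbreviated subset of time_pt_nested_list.

```python
def is_valid_partition(spans, T, allowed, t0=0.0):
    """Check that spans are contiguous, non-overlapping, and cover [t0, T].

    Hypothetical helper: `allowed` is the nested list of time points, and
    every interior boundary must come from it.
    """
    if not spans or spans[0][0] != t0 or spans[-1][1] != T:
        return False
    allowed_set = {t for row in allowed for t in row}
    for (start, end), (next_start, _) in zip(spans, spans[1:]):
        # each span needs positive length, must end where the next one
        # begins, and that shared boundary must be a listed time point
        if not start < end or end != next_start or end not in allowed_set:
            return False
    last_start, last_end = spans[-1]
    return last_start < last_end


example = [(0.0, 32.892), (32.892, 64.784), (64.784, 229.74),
           (229.74, 306.285), (306.285, 418.575), (418.575, 437.47)]
# abbreviated subset of the nested list, enough to cover the example
allowed = [[0.0, 32.892], [64.784, 306.285], [229.74, 418.575]]
print(is_valid_partition(example, 437.47, allowed))  # True
```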
Upvotes: 1
Views: 59
Reputation: 11171
This is simpler than it may look. You really just need to sample n - 1 interior points from your sublists, each with a row-dependent sample probability. Whatever samples are obtained can then be time-ordered to construct your n tuples.
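import numpy as np
# user params
n = 6
prob_falloff_param = 0.2
# flatten to (row index, time) pairs, sorted by time
lin_list = sorted(
    [(idx, el) for idx, row in enumerate(time_pt_nested_list) for el in row],
    key=lambda x: x[1],
)
# endpoints required, excluded from random selection process
# (here the earliest and latest listed points; use 0 and the video length instead if given separately)
t0 = lin_list.pop(0)[1]
T = lin_list.pop(-1)[1]
arr = np.array(lin_list)
# define row weights; prob_falloff_param controls how fast they decay with row index
weights = np.exp(-prob_falloff_param*arr[:,0]**2)
norm_weights = weights/np.sum(weights)
# choose (weighted) random points, create tuple list:
random_points = sorted(np.random.choice(arr[:,1], size=(n-1), replace=False, p=norm_weights))
time_arr = [t0, *random_points, T]
output = list(zip(time_arr, time_arr[1:]))
Example outputs (random, so they vary from run to run; the two below include many late-row points, so they look like draws made without the weights):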
# n = 6
[(0.0, 78.711),
(78.711, 84.454),
(84.454, 158.534),
(158.534, 210.112),
(210.112, 372.153),
(372.153, 428.965)]
# n = 12
[(0.0, 6.131),
(6.131, 43.424),
(43.424, 64.784),
(64.784, 84.454),
(84.454, 102.734),
(102.734, 210.112),
(210.112, 229.74),
(229.74, 244.582),
(244.582, 264.021),
(264.021, 372.153),
(372.153, 424.967),
(424.967, 428.965)]
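If the video length is supplied separately, as in the question (where 437.47 is not in the list), the same idea can be wrapped in a function. The following is a minimal sketch rather than a definitive implementation: the name sample_spans, its parameters, and the weight exp(-decay * row_index) are illustrative assumptions, and it uses NumPy's Generator API so that a seed can be passed. The points list is an abbreviated subset of the question's data.

```python
import numpy as np


def sample_spans(nested, n, T, decay=0.5, t0=0.0, seed=None):
    """Return n contiguous (start, end) spans covering [t0, T].

    Hypothetical helper: the n - 1 interior cut points are drawn without
    replacement from `nested`, each weighted by exp(-decay * row_index).
    """
    rng = np.random.default_rng(seed)
    # map each candidate time to the lowest row index it appears in,
    # skipping anything at or outside the fixed endpoints
    rows = {}
    for idx, row in enumerate(nested):
        for t in row:
            if t0 < t < T:
                rows.setdefault(t, idx)
    if n < 1 or n - 1 > len(rows):
        raise ValueError("n must be between 1 and the number of candidates + 1")
    times = np.array(list(rows.keys()), dtype=float)
    idxs = np.array(list(rows.values()), dtype=float)
    # larger decay means later rows are picked less often
    weights = np.exp(-decay * idxs)
    cuts = rng.choice(times, size=n - 1, replace=False, p=weights / weights.sum())
    edges = [t0, *sorted(float(c) for c in cuts), T]
    return list(zip(edges, edges[1:]))


# abbreviated subset of time_pt_nested_list, for illustration only
points = [[0.0, 6.131, 32.892, 43.424], [11.066, 35.73, 64.784], [48.447, 229.74], [66.37]]
spans = sample_spans(points, n=4, T=437.47, decay=0.5, seed=0)
print(spans)
```

Setting decay=0 gives every point the same weight, while larger values concentrate the cuts in the first rows.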
Upvotes: 1