Reputation: 1265
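I would like to simulate a system comprising consumers (f and g below) that each sample at a fixed interval, and events that arrive at arbitrary times.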
For example, with f sampling at 5ms and g sampling at 10ms, the sequence might look like:
0ms: f samples, g samples
5ms: f samples
10ms: f samples, g samples
12ms: event arrives
13ms: event arrives
15ms: f samples
20ms: f samples, g samples
25ms: f samples
27ms: event arrives
30ms: f samples, g samples
35ms: f samples
37ms: event arrives
For each emission, the consumer which samples closest to (but after) the event time "wins". In the event of a tie, the winner should be chosen randomly. For example, f's sample at 15ms wins both the 12ms and the 13ms events.
I've attempted to implement this by merging the timelines onto one index:
import numpy as np
import pandas as pd
f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = [12, 13, 27, 37]
df = pd.concat([pd.Series(f, f), pd.Series(g, g), pd.Series(events, events)], axis=1)
Which yields a DataFrame like this:
f g events
0 0 0 NaN
5 5 NaN NaN
10 10 10 NaN
12 NaN NaN 12
13 NaN NaN 13
15 15 NaN NaN
20 20 20 NaN
25 25 NaN NaN
27 NaN NaN 27
30 30 30 NaN
35 35 NaN NaN
37 NaN NaN 37
I've been noodling around trying to find the winners with various operations against the following roll-up:
In [103]: df.expanding().max()
f g events
0 0 0 NaN
5 5 0 NaN
10 10 10 NaN
12 10 10 12
13 10 10 13
15 15 10 13
20 20 20 13
25 25 20 13
27 25 20 27
30 30 30 27
35 35 30 27
37 35 30 37
...but have been having a hard time finding a pandas-ish way to do it.
I feel pretty close with the following:
In [141]: x = df.sort_index(ascending=False).expanding().min()
gx = x.groupby('events')
print(gx.max())
events f g
12 15 20
13 15 20
27 30 30
37 35 30
Any ideas?
Upvotes: 3
Views: 157
Reputation: 97291
Use bfill to fill NaNs backward in the "f" and "g" columns:
import numpy as np
import pandas as pd
f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = [12, 13, 27, 37]
df = pd.concat([pd.Series(f, f), pd.Series(g, g), pd.Series(events, events)], axis=1)
df.columns = "f", "g", "event"
df = df.sort_index()  # bfill below relies on the index being in time order
df[["f", "g"]] = df[["f", "g"]].bfill()
df2 = df.dropna()
print(df2)
Here is the output. The event at 37 is dropped because neither f nor g samples after it:
f g event
12 15 20 12
13 15 20 13
27 30 30 27
Then we can compare f & g:
print(np.sign(df2.f - df2.g).replace({-1:"f", 1:"g", 0:"fg"}))
The output is:
12 f
13 f
27 fg
dtype: object
This means the events at 12 and 13 are taken by "f", and the winner of the event at 27 should be chosen randomly.
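To resolve the "fg" ties, one option is to draw a coin flip per event and consult it only where both consumers sample at the same time. The following is a minimal sketch, assuming a fair 50/50 choice is what "randomly" means here. The names rng, coin, f_wins and winner are illustrative and not part of the code above, and the seed is arbitrary.

```python
import numpy as np
import pandas as pd

f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = [12, 13, 27, 37]

# Same construction as above: one column per timeline, indexed by time.
df = pd.concat(
    [pd.Series(f, index=f), pd.Series(g, index=g), pd.Series(events, index=events)],
    axis=1,
)
df.columns = ["f", "g", "event"]
df = df.sort_index()  # make sure rows are in time order before back-filling

# For every row, the next sample time of each consumer.
df[["f", "g"]] = df[["f", "g"]].bfill()

# Keep only event rows that have a later sample from both consumers.
df2 = df.dropna()

# Assumption: ties are broken with a fair coin (0 -> f, 1 -> g).
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
coin = rng.integers(0, 2, size=len(df2))

# f wins if it samples strictly sooner, or on a tie when the coin says so.
f_wins = (df2["f"] < df2["g"]) | ((df2["f"] == df2["g"]) & (coin == 0))
winner = pd.Series(np.where(f_wins, "f", "g"), index=df2.index, name="winner")
print(winner)
```

The events at 12 and 13 always go to "f"; only the event at 27 depends on the coin. The event at 37 is still absent because no sample follows it in this data.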
Upvotes: 2