Simulating event system using pandas

Question

I would like to simulate a system comprising:

one emitter that produces n events randomly spaced within [0, 1000) ms
one consumer that samples the event stream uniformly spaced at intervals of f ms
one consumer that samples the event stream uniformly spaced at intervals of g ms

For example, with f sampling at 5ms and g sampling at 10ms, the sequence might look like:

0ms: f samples, g samples
5ms: f samples
10ms: f samples, g samples
12ms: event arrives
13ms: event arrives
15ms: f samples
20ms: f samples, g samples
25ms: f samples
27ms: event arrives
30ms: f samples, g samples
35ms: f samples
37ms: event arrives

For each emission, the consumer which samples closest to (but after) the event time "wins". In the event of a tie the winner should be chosen randomly. For example f's sample at 15ms wins both the 12ms and the 13ms event.

I've attempted to implement this by merging the timelines onto one index:

import numpy as np
import pandas as pd

f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = [12, 13, 27, 37]

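# sort_index() keeps the merged timeline in time order (recent pandas may not sort the unioned index)
df = pd.concat([pd.Series(f, f), pd.Series(g, g), pd.Series(events, events)], axis=1).sort_index()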

Which yields a DataFrame like this:

     f   g  events
0    0   0     NaN
5    5 NaN     NaN
10  10  10     NaN
12 NaN NaN      12
13 NaN NaN      13
15  15 NaN     NaN
20  20  20     NaN
25  25 NaN     NaN
27 NaN NaN      27
30  30  30     NaN
35  35 NaN     NaN
37 NaN NaN      37

I've been noodling around trying to find the winners with various operations against the following roll-up:

In [103]: df.expanding().max()  # formerly pd.expanding_max(df)
     f   g  events
0    0   0     NaN
5    5   0     NaN
10  10  10     NaN
12  10  10      12
13  10  10      13
15  15  10      13
20  20  20      13
25  25  20      13
27  25  20      27
30  30  30      27
35  35  30      27
37  35  30      37

...but have been having a hard time finding a pandas-ish way to do it.

I feel pretty close with the following:

In [141]: x = df.sort_index(ascending=False).expanding().min()
          gx = x.groupby('events')
          print(gx.max())
         f   g
events        
12      15  20
13      15  20
27      30  30
37      35  30

Any ideas?

HYRY · Accepted Answer

Use bfill to back-fill the NaNs in the "f" and "g" columns, so that each event row carries the time of the next sample from each consumer:

import numpy as np
import pandas as pd

f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = [12, 13, 27, 37]

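# Merge the three timelines onto one index; sort_index() keeps the rows in time order
df = pd.concat([pd.Series(f, f), pd.Series(g, g), pd.Series(events, events)], axis=1).sort_index()
df.columns = ["f", "g", "event"]
# Back-fill so each row holds the next sample time of f and of g
df[["f", "g"]] = df[["f", "g"]].bfill()
# Keep only the event rows that have a later sample in both columns
df2 = df.dropna()
print(df2)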

Here is the output (the event at 37 has no later sample in either column, so dropna removes it):

     f   g  event
12  15  20     12
13  15  20     13
27  30  30     27

Then compare f and g; the smaller value is the earlier sample, so that consumer wins:

print(np.sign(df2.f - df2.g).replace({-1:"f", 1:"g", 0:"fg"}))

The output is:

12     f
13     f
27    fg
dtype: object

This means the events at 12 and 13 are taken by "f", and the winner of the event at 27 should be chosen randomly.
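
The random choice for tied events is not shown above. One possible way to do it, as a minimal sketch: draw a coin flip for every event and use it only where f and g sample at the same time. The names rng, coin and winners and the seed value are assumptions made for illustration, and sort_index() is added in case the pandas version in use does not sort the merged index.

```python
import numpy as np
import pandas as pd

f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = [12, 13, 27, 37]

# Same preparation as above: merge the timelines, back-fill, keep event rows
df = pd.concat([pd.Series(f, f), pd.Series(g, g), pd.Series(events, events)], axis=1).sort_index()
df.columns = ["f", "g", "event"]
df[["f", "g"]] = df[["f", "g"]].bfill()
df2 = df.dropna()

# Hypothetical tie-breaker: one random pick per event, used only on ties
rng = np.random.default_rng(0)  # the seed is an arbitrary choice
coin = rng.choice(["f", "g"], size=len(df2))

# The earlier next-sample time wins; equal times fall back to the coin flip
winner = np.where(df2["f"] < df2["g"], "f",
                  np.where(df2["f"] > df2["g"], "g", coin))
winners = pd.Series(winner, index=df2.index, name="winner")
print(winners)
```

The events at 12 and 13 always go to "f"; only the event at 27 depends on the random draw.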
