Reputation: 299
Following up on this question, which provided a clever solution for applying a function over multiple columns in a DataFrame, I'm wondering whether that solution can be further optimized for speed.
Environment: Python 2.7.8, pandas 0.14.1, NumPy 1.8.
Here's the example setup:
import pandas as pd
import numpy as np
import random

def meanmax(ii, df):
    xdf = df.iloc[map(int, ii)]
    n = max(xdf['A']) + max(xdf['B'])
    return n / 2.0

df = pd.DataFrame(np.random.randn(2500, 2) / 10000,
                  index=pd.date_range('2001-01-01', periods=2500),
                  columns=['A', 'B'])
df['ii'] = range(len(df))

res = pd.rolling_apply(df.ii, 26, lambda x: meanmax(x, df))
Note that the meanmax function is not pairwise, so something like rolling_mean(df['A'] + df['B'], 26) won't work.
However, I can do something like:
res2 = (pd.rolling_max(df['A'], 26) + pd.rolling_max(df['B'], 26)) / 2
which completes roughly 3000x faster:
%timeit res = pd.rolling_apply(df.ii, 26, lambda x: meanmax(x, df))
1 loops, best of 3: 1 s per loop
%timeit res2 = (pd.rolling_max(df['A'],26) + pd.rolling_max(df['B'],26)) / 2
1000 loops, best of 3: 325 µs per loop
Is there anything better than, or equivalent to, the second option above, given the example function and using rolling_apply? While the second option is faster, it doesn't use rolling_apply, which can be applied to a wider set of problems. An example of such a wider problem follows below.
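For instance, something like the following has no per-column decomposition that I can see (the helper b_at_max_a is a hypothetical example, made up just to illustrate the wider problem set):
def b_at_max_a(ii, df):
    # B's value on the day A peaked within the window: this needs joint
    # access to both columns, so no single-column rolling primitive applies.
    xdf = df.iloc[map(int, ii)]
    return xdf['B'].iloc[xdf['A'].values.argmax()]

res3 = pd.rolling_apply(df.ii, 26, lambda x: b_at_max_a(x, df))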
Edit: Performance timing correction
Upvotes: 3
Views: 5786
Reputation: 353059
You won't be able to get down to rolling_max speed, but you can often shave off an order of magnitude or so by dropping down to numpy via .values:
def meanmax_np(ii, df):
    # Convert the float window indices back to integers.
    ii = ii.astype(int)
    # Index the underlying NumPy arrays directly, skipping Series overhead.
    n = df["A"].values[ii].max() + df["B"].values[ii].max()
    return n / 2.0
gives me
>>> %timeit res = pd.rolling_apply(df.ii, 26, lambda x: meanmax(x, df))
1 loops, best of 3: 701 ms per loop
>>> %timeit res_np = pd.rolling_apply(df.ii, 26, lambda x: meanmax_np(x, df))
10 loops, best of 3: 31.2 ms per loop
>>> %timeit res2 = (pd.rolling_max(df['A'],26) + pd.rolling_max(df['B'],26)) / 2
1000 loops, best of 3: 247 µs per loop
which, though still about 100x slower than the optimized case, is much faster than the original. Sometimes I only need something to be ten times faster for it to stop being the dominant time sink, and in those cases this suffices.
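If even that is too slow, a further refinement in the same spirit is to hoist the .values extraction out of the per-window function entirely, so each call does nothing but NumPy indexing. A sketch of the idea (the name meanmax_hoisted is made up for illustration; it isn't benchmarked above):
a = df['A'].values
b = df['B'].values

def meanmax_hoisted(ii):
    # a and b are closed over, so .values is not re-extracted on every window.
    ii = ii.astype(int)
    return (a[ii].max() + b[ii].max()) / 2.0

res_hoisted = pd.rolling_apply(df.ii, 26, meanmax_hoisted)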
Upvotes: 3
Reputation: 67427
Computing a generic rolling function over an array of size n with a window of size m requires roughly O(n*m) time. The built-in rolling_xxx methods use some pretty smart algorithms to keep the running time well below that, and can often guarantee O(n) time, which, if you think about it, is a pretty impressive thing.
rolling_min and rolling_max in particular borrowed their implementation from bottleneck, which cites Richard Harter as the source of the algorithm, although I found what I think is an earlier description of the same algorithm in this paper.
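For the curious, the core of that algorithm is a monotonic deque of candidate indices: every element is pushed and popped at most once, which is where the O(n) bound comes from. Here is an illustrative sketch of the technique in pure Python (my own toy version, not bottleneck's actual implementation):
from collections import deque
import numpy as np

def rolling_max_sketch(arr, window):
    # The deque holds indices whose values are strictly decreasing,
    # so the front is always the maximum of the current window.
    out = np.full(len(arr), np.nan)
    dq = deque()
    for i, x in enumerate(arr):
        # Evict candidates that can never again be a window maximum.
        while dq and arr[dq[-1]] <= x:
            dq.pop()
        dq.append(i)
        # Evict the front index once it slides out of the window.
        if dq[0] <= i - window:
            dq.popleft()
        if i >= window - 1:
            out[i] = arr[dq[0]]
    return out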
So, after the history lesson: it is very likely that you will not be able to have your cake and eat it too. rolling_apply is very convenient, but it is almost always going to sacrifice performance relative to a specific algorithm. In my experience, one of the more enjoyable parts of using the Python scientific stack is coming up with efficient ways of doing computations, using the fast primitives provided in creative ways. Your own solution, calling rolling_max twice, is a good example of this. So relax and enjoy the ride, knowing that you will always have rolling_apply to fall back on if you, or the good people of SO, cannot come up with a smarter solution.
Upvotes: 6