PyRsquared

Reputation: 7338

How does Pandas compute exponential moving averages under the hood?

I am trying to compare pandas EMA performance to numba performance.

Generally, I don't write my own functions when pandas already has them built in (quantile, sort_values, etc.), since pandas will almost always be faster than my hand-coded Python. I believe this is because much of pandas is implemented in C under the hood, and because pandas .apply() methods are much faster than explicit Python for loops due to vectorization (but I'm open to an explanation if this is not true). For computing EMAs, however, I have found that numba far outperforms pandas.

The EMA I have coded is defined by

S_t = Y_1, t = 1

S_t = alpha*Y_t + (1 - alpha)*S_{t-1}, t > 1

where Y_t is the value of the time series at time t, S_t is the value of the moving average at time t, and alpha is the smoothing parameter.

The code is as follows

from numba import jit
import pandas as pd
import numpy as np

@jit
def ewm(arr, alpha):
    """
    Calculate the EMA of an array arr
    :param arr: numpy array of floats
    :param alpha: float between 0 and 1
    :return: numpy array of floats
    """
    # initialise ewm_arr
    ewm_arr = np.zeros_like(arr)
    ewm_arr[0] = arr[0]
    for t in range(1,arr.shape[0]):
        ewm_arr[t] = alpha*arr[t] + (1 - alpha)*ewm_arr[t-1]

    return ewm_arr

# initialize array and dataframe randomly
a = np.random.random(10000)
df = pd.DataFrame(a)

%timeit df.ewm(com=0.5, adjust=False).mean()
>>> 1000 loops, best of 3: 1.77 ms per loop

%timeit ewm(a, 0.5)
>>> 10000 loops, best of 3: 34.8 µs per loop

We see that the hand-coded ewm function is around 50 times faster than the pandas ewm method.

It may be that numba also outperforms various other pandas methods, depending on how one codes the function. But here I am interested in how numba outperforms pandas in calculating exponential moving averages. What is pandas doing (or not doing) that makes it slow, or is numba just extremely fast in this case? How does pandas compute EMAs under the hood?

Upvotes: 1

Views: 2672

Answers (1)

Brad Solomon

Reputation: 40918

But here I am interested in how numba outperforms Pandas in calculating exponential moving averages.

Your version appears to be faster solely because you're passing it a NumPy array rather than a Pandas data structure:

>>> s = pd.Series(np.random.random(10000))

>>> %timeit ewm(s, alpha=0.5)
82 ms ± 10.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit ewm(s.values, alpha=0.5)
26 µs ± 193 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> %timeit s.ewm(alpha=0.5).mean()
852 µs ± 5.44 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In general, comparing NumPy versus Pandas operations is apples-to-oranges. The latter is built on top of the former and will almost always trade speed for flexibility. (But, taking that into consideration, Pandas is still fast and has come to rely more heavily on Cython ops over time.) I'm not sure specifically what it is about numba/jit that behaves better with NumPy. But if you compare both functions using a Pandas Series, Pandas itself comes out faster.

How does Pandas compute EMAs under the hood?

When you call df.ewm() (without yet calling a method such as .mean() or .cov()), the intermediate result is an instance of a bona fide class, EWM, defined in pandas/core/window.py.

>>> ewm = pd.DataFrame().ewm(alpha=0.1)
>>> type(ewm)
<class 'pandas.core.window.EWM'>

Whether you pass com, span, halflife, or alpha, Pandas will map this back to a com and use that.
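The mapping back to com follows from the relationships given in the Pandas documentation: alpha = 1/(1 + com), alpha = 2/(span + 1), and alpha = 1 - exp(ln(0.5)/halflife). Here is a minimal sketch of that conversion, assuming exactly one parameter is passed. Note that to_com is a hypothetical helper written for illustration, not the Pandas function itself:

```python
import math

def to_com(com=None, span=None, halflife=None, alpha=None):
    """Hypothetical helper: convert one decay parameter to a center of mass."""
    given = [p for p in (com, span, halflife, alpha) if p is not None]
    if len(given) != 1:
        raise ValueError("pass exactly one of com, span, halflife, alpha")
    if com is not None:
        return float(com)
    if span is not None:
        # alpha = 2 / (span + 1)  =>  com = (span - 1) / 2
        return (span - 1) / 2.0
    if halflife is not None:
        # alpha = 1 - exp(ln(0.5) / halflife)
        alpha = 1.0 - math.exp(math.log(0.5) / halflife)
    # alpha = 1 / (1 + com)  =>  com = (1 - alpha) / alpha
    return (1.0 - alpha) / alpha
```

For instance, to_com(alpha=0.5) gives 1.0, whereas com=0.5 corresponds to alpha = 1/(1 + 0.5) = 2/3, so com=0.5 and alpha=0.5 describe different amounts of smoothing.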

When you call the method itself, such as ewm.mean(), this maps to ._apply(), which in this case serves as a router to the appropriate Cython function:

cfunc = getattr(_window, func, None)

In the case of .mean(), func is "ewma". _window is the Cython module pandas/_libs/window.pyx.

That brings you to the heart of things, at the function ewma(), which is where the bulk of the work takes place:

weighted_avg = ((old_wt * weighted_avg) +
                (new_wt * cur)) / (old_wt + new_wt)
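To see how that line fits into a loop, here is a simplified pure-Python sketch of the recurrence. It assumes the input contains no NaNs and ignores minimum-period handling, so it is not the Cython source; ewma_sketch and its arguments are names chosen for illustration.

```python
def ewma_sketch(vals, com, adjust=True):
    """Simplified exponentially weighted mean for a NaN-free sequence (illustrative only)."""
    if len(vals) == 0:
        return []
    alpha = 1.0 / (1.0 + com)
    old_wt_factor = 1.0 - alpha
    # With adjust=True each new point gets weight 1; otherwise weight alpha.
    new_wt = 1.0 if adjust else alpha

    weighted_avg = float(vals[0])
    old_wt = 1.0
    out = [weighted_avg]
    for cur in vals[1:]:
        # Decay the accumulated weight of all earlier observations.
        old_wt *= old_wt_factor
        weighted_avg = ((old_wt * weighted_avg) + (new_wt * cur)) / (old_wt + new_wt)
        if adjust:
            old_wt += new_wt   # keep accumulating the normalising weight
        else:
            old_wt = 1.0       # reset, giving the plain recursive form
        out.append(weighted_avg)
    return out
```

With adjust=False the update reduces to alpha*cur + (1 - alpha)*weighted_avg, which is the recursion used in the question. With adjust=True the early values differ because the weights are renormalised at every step.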

If you'd like a fairer comparison, call this function directly with the underlying NumPy values:

>>> from pandas._libs.window import ewma
>>> %timeit ewma(s.values, 0.4, 0, 0, 0)
513 µs ± 10.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

(Remember, it takes only a com; to convert from another parameter, you can use pandas.core.window._get_center_of_mass().)
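As a sanity check alongside the timings, the public API can confirm that the recursion from the question and Pandas agree once the parameters match. This is a sketch assuming a reasonably recent pandas and NumPy. ema_loop is a plain-Python stand-in for the numba function, and the seed and array size are arbitrary:

```python
import numpy as np
import pandas as pd

def ema_loop(arr, alpha):
    """Plain-Python version of the recursion S_t = alpha*Y_t + (1 - alpha)*S_{t-1}."""
    out = np.empty(len(arr), dtype=float)
    out[0] = arr[0]
    for t in range(1, len(arr)):
        out[t] = alpha * arr[t] + (1 - alpha) * out[t - 1]
    return out

rng = np.random.default_rng(0)
s = pd.Series(rng.random(1000))

manual = ema_loop(s.to_numpy(), alpha=0.5)
# adjust=False selects the recursive form; the default adjust=True
# renormalises the weights and gives different early values.
from_pandas = s.ewm(alpha=0.5, adjust=False).mean().to_numpy()
matches = np.allclose(manual, from_pandas)
```

Both sides must use the same alpha and adjust=False for the outputs to line up. Only then do the timings compare the same calculation.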

Upvotes: 1
