norok2

Reputation: 26886

Slow `statistics` functions

Why is statistics.mean() so slow compared to the NumPy version, or even to a naive implementation such as:

def mean(items):
    return sum(items) / len(items)

On my system, I get the following timings:

import numpy as np
import statistics

ll_int = [x for x in range(100_000)]

%timeit statistics.mean(ll_int)
# 42 ms ± 408 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit sum(ll_int) / len(ll_int)
# 460 µs ± 5.43 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit np.mean(ll_int)
# 4.62 ms ± 38.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


ll_float = [x / 10 for x in range(100_000)]

%timeit statistics.mean(ll_float)
# 56.7 ms ± 879 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit sum(ll_float) / len(ll_float)
# 459 µs ± 7.39 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit np.mean(ll_float)
# 2.7 ms ± 11.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

I get similar timings for other functions like variance or stdev.

EDIT: Even an iterative implementation like this:

def next_mean(value, mean_, num):
    # Fold one more value into the running mean (num = items already seen).
    return (num * mean_ + value) / (num + 1)

def imean(items, mean_=0.0):
    for i, item in enumerate(items):
        mean_ = next_mean(item, mean_, i)
    return mean_

seems to be faster:

%timeit imean(ll_int)
# 16.6 ms ± 62.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit imean(ll_float)
# 16.2 ms ± 429 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Upvotes: 2

Views: 123

Answers (2)

snakecharmerb

Reputation: 55600

The developer of the statistics module made an explicit decision to value correctness over speed:

Correctness over speed. It is easier to speed up a correct but slow function than to correct a fast but buggy one.

and moreover stated that there was no intention

to replace, or even compete directly with, numpy

However, an enhancement request was raised to add a faster, simpler implementation, statistics.fmean, which will be released in Python 3.8. According to the enhancement's developer, this function is up to 500 times faster than the existing statistics.mean.

The fmean implementation is pretty much sum/len.
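One way to sketch what such a fast mean looks like (using math.fsum for a fast but still accurate float sum; the function name and details here are illustrative, not the exact CPython implementation):

import math

def fast_mean(data):
    # Convert to float once and sum in C code; no per-element Fraction machinery.
    n = len(data)
    if n == 0:
        raise ValueError("fast_mean requires at least one data point")
    # math.fsum tracks partial sums to limit float round-off while staying fast.
    return math.fsum(data) / n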

Upvotes: 2

John Coleman
John Coleman

Reputation: 51988

The statistics module uses interpreted Python code, while numpy uses optimized compiled code for all of its heavy lifting, so it would be surprising if numpy didn't blow statistics out of the water.

Furthermore, statistics is designed to play nicely with modules like decimal and fractions, and uses code which values numerical accuracy and type safety over speed. Your naive implementation uses the built-in sum; the statistics module instead uses its own _sum function internally. Looking at its source shows that it does an awful lot more than just add things together:

def _sum(data, start=0):
    """_sum(data [, start]) -> (type, sum, count)
    Return a high-precision sum of the given numeric data as a fraction,
    together with the type to be converted to and the count of items.
    If optional argument ``start`` is given, it is added to the total.
    If ``data`` is empty, ``start`` (defaulting to 0) is returned.
    Examples
    --------
    >>> _sum([3, 2.25, 4.5, -0.5, 1.0], 0.75)
    (<class 'float'>, Fraction(11, 1), 5)
    Some sources of round-off error will be avoided:
    # Built-in sum returns zero.
    >>> _sum([1e50, 1, -1e50] * 1000)
    (<class 'float'>, Fraction(1000, 1), 3000)
    Fractions and Decimals are also supported:
    >>> from fractions import Fraction as F
    >>> _sum([F(2, 3), F(7, 5), F(1, 4), F(5, 6)])
    (<class 'fractions.Fraction'>, Fraction(63, 20), 4)
    >>> from decimal import Decimal as D
    >>> data = [D("0.1375"), D("0.2108"), D("0.3061"), D("0.0419")]
    >>> _sum(data)
    (<class 'decimal.Decimal'>, Fraction(6963, 10000), 4)
    Mixed types are currently treated as an error, except that int is
    allowed.
    """
    count = 0
    n, d = _exact_ratio(start)
    partials = {d: n}
    partials_get = partials.get
    T = _coerce(int, type(start))
    for typ, values in groupby(data, type):
        T = _coerce(T, typ)  # or raise TypeError
        for n,d in map(_exact_ratio, values):
            count += 1
            partials[d] = partials_get(d, 0) + n
    if None in partials:
        # The sum will be a NAN or INF. We can ignore all the finite
        # partials, and just look at this special one.
        total = partials[None]
        assert not _isfinite(total)
    else:
        # Sum all the partial sums using builtin sum.
        # FIXME is this faster if we sum them in order of the denominator?
        total = sum(Fraction(n, d) for d, n in sorted(partials.items()))
    return (T, total, count)

The most surprising thing about this code is that it converts the data to exact fractions so as to minimize round-off error. There is no reason to expect code like this to be as quick as a simple sum(nums)/len(nums) approach.
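To see what that accuracy buys, here is the round-off example from the docstring above run through both approaches (outputs shown in the comments, assuming standard IEEE-754 doubles):

import statistics

data = [1e50, 1, -1e50] * 1000

print(sum(data) / len(data))   # 0.0 -- each 1 is swallowed by 1e50, then the large terms cancel
print(statistics.mean(data))   # 0.3333333333333333 -- the sum is kept as an exact Fraction(1000, 1)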

Upvotes: 3
