Sait

Reputation: 19815

Calculating the averages iteratively and efficiently

I have a long list of integers:

my_list = [10,13,42,23,12,45,33,59,12]

I want to calculate the average of the first i numbers, for every i with 0 < i < n.

I can basically do:

averages = [ sum(my_list[0:i]) * (1.0/i) for i in range(1,len(my_list)) ]

which gives me the correct results, but it re-sums the whole prefix for every i. I think there should be a faster way of doing this, since each sum could reuse the previous one.

I guess there should be a faster solution with numpy, maybe?

Upvotes: 1

Views: 1464

Answers (3)

ShadowRanger

Reputation: 155418

If you're using Python 3.2 or higher, see itertools.accumulate (and itertools.islice if you're trying to get a running average for a subset of the inputs). For example, in your case (getting the running average for n - 1 values in the input):

import itertools
my_list = [10,13,42,23,12,45,33,59,12]
sums = itertools.islice(itertools.accumulate(my_list), len(my_list) - 1)
# If you didn't intend to omit the final value, it's just:
# sums = itertools.accumulate(my_list)
averages = [accum / i for i, accum in enumerate(sums, start=1)]
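As a quick sanity check, here is a minimal sketch that wraps the same accumulate idea in a function and compares it against the slice-and-sum approach from the question. The name running_averages and the skip_last flag are made up for illustration, and it assumes Python 3 (true division with /).

```python
import itertools

def running_averages(values, skip_last=False):
    """Return the average of the first i values for i = 1..n.

    Hypothetical helper for illustration; skip_last=True drops the
    final average, mirroring range(1, len(my_list)) in the question.
    """
    # accumulate yields the running totals, so each value is added only once
    sums = itertools.accumulate(values)
    averages = [total / i for i, total in enumerate(sums, start=1)]
    return averages[:-1] if skip_last else averages

my_list = [10, 13, 42, 23, 12, 45, 33, 59, 12]

# Slice-and-sum version from the question: re-sums every prefix
naive = [sum(my_list[:i]) / i for i in range(1, len(my_list))]
fast = running_averages(my_list, skip_last=True)
```

Both lists should come out identical here, since the prefix sums are exact integers either way; the accumulate version just does a single pass instead of re-summing each prefix.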

Upvotes: 4

Akavall

Reputation: 86188

How about using numpy.cumsum? (np.float has been removed from recent NumPy releases, so the example uses the builtin float.)

In [13]: import numpy as np

In [14]: my_list = [10,13,42,23,12,45,33,59,12]

In [15]: np.cumsum(my_list) / np.arange(1, len(my_list)+1, dtype=float)
Out[15]: 
array([ 10.        ,  11.5       ,  21.66666667,  22.        ,
        20.        ,  24.16666667,  25.42857143,  29.625     ,  27.66666667])

Upvotes: 6

metatoaster

Reputation: 18908

You can simply write a generator function. It can be evaluated lazily, or turned into a full list in one go by passing it to list(). Using enumerate with a start of 1 gives the count of elements seen so far.

>>> def average_series(items):
...     total = 0  # avoid shadowing the builtin sum()
...     for i, item in enumerate(items, 1):
...         total += item
...         yield float(total) / i
... 
>>> list(average_series([10,13,42,23,12,45,33,59,12]))
[10.0, 11.5, 21.666666666666668, ..., 27.666666666666668]

You don't always have to rely on numpy for simple things like this.
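To illustrate the lazy-evaluation point, here is a small sketch that feeds the generator an endless stream and takes only the first few averages with itertools.islice. It assumes Python 3 (so plain / is enough), and the names total, count and first_five are just for illustration.

```python
import itertools

def average_series(items):
    # Same idea as the generator above, written for Python 3
    total = 0
    for count, item in enumerate(items, 1):
        total += item
        yield total / count

# itertools.count(1) never ends (1, 2, 3, ...), so calling list() on the
# generator directly would never return; islice stops after five averages.
first_five = list(itertools.islice(average_series(itertools.count(1)), 5))
print(first_five)
```

Only as many items are consumed as averages are requested, which is handy when the input is large or arrives as a stream.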

Upvotes: 3
