Reputation: 19815
I have a long list of integers:
my_list = [10,13,42,23,12,45,33,59,12]
I want to calculate the average of the first i numbers, for all i in 0 < i < n.
I can basically do:
averages = [ sum(my_list[0:i]) * (1.0/i) for i in range(1,len(my_list)) ]
which gives me the correct results, but I think there should be a faster way of doing this since I can use the previous sums in the following calculations.
I guess there should be a faster solution with numpy, maybe?
Upvotes: 1
Views: 1464
Reputation: 155418
If you're using Python 3.2 or higher, see itertools.accumulate (and itertools.islice if you're trying to get a running average for a subset of the inputs). For example, in your case (getting the running average for the first n - 1 values in the input):
import itertools
my_list = [10,13,42,23,12,45,33,59,12]
sums = itertools.islice(itertools.accumulate(my_list), len(my_list) - 1)
# If you didn't intend to omit the final value, it's just:
# sums = itertools.accumulate(my_list)
averages = [accum / i for i, accum in enumerate(sums, start=1)]
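As a quick sanity check, the sketch below compares the running-sum version against the list comprehension from the question. It assumes Python 3 (so `/` is true division); the names `naive`, `running`, and `matches` are only illustrative.

```python
import itertools

my_list = [10, 13, 42, 23, 12, 45, 33, 59, 12]

# Quadratic approach from the question: every prefix is summed from scratch.
naive = [sum(my_list[0:i]) * (1.0 / i) for i in range(1, len(my_list))]

# Linear approach: each prefix sum reuses the previous one.
sums = itertools.islice(itertools.accumulate(my_list), len(my_list) - 1)
running = [accum / i for i, accum in enumerate(sums, start=1)]

# Compare with a tolerance, because x * (1.0 / i) and x / i can round differently.
matches = all(abs(a - b) < 1e-9 for a, b in zip(naive, running))
print(len(running), matches)
```

Both lists hold the averages for the first n - 1 prefixes; the difference is only in how much work is repeated.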
Upvotes: 4
Reputation: 86188
How about using numpy.cumsum?
In [13]: import numpy as np
In [14]: my_list = [10,13,42,23,12,45,33,59,12]
In [15]: np.cumsum(my_list) / np.arange(1, len(my_list)+1, dtype=float)
Out[15]:
array([ 10. , 11.5 , 21.66666667, 22. ,
20. , 24.16666667, 25.42857143, 29.625 , 27.66666667])
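Note that the output above includes the average over all n values, whereas the question asks for 0 < i < n. A minimal sketch of trimming the last entry, assuming Python 3 and a NumPy version where `/` on integer arrays yields floats (the variable names are illustrative):

```python
import numpy as np

my_list = [10, 13, 42, 23, 12, 45, 33, 59, 12]

# counts[k] is the number of elements in the (k+1)-th prefix.
counts = np.arange(1, len(my_list) + 1)

# Prefix sums divided element-wise by prefix lengths.
averages = np.cumsum(my_list) / counts

# Drop the final entry to match 0 < i < n from the question.
first_n_minus_1 = averages[:-1]
print(first_n_minus_1)
```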
Upvotes: 6
Reputation: 18908
You can simply write a generator function. It allows lazy evaluation, or you can build the whole list in one go by passing the result to list(). Use enumerate to keep a running count of the elements seen so far.
>>> def average_series(items):
...     total = 0  # running sum; avoids shadowing the built-in sum()
...     for i, item in enumerate(items, 1):
...         total += item
...         yield float(total) / i
...
>>> list(average_series([10,13,42,23,12,45,33,59,12]))
[10.0, 11.5, 21.666666666666668, ..., 27.666666666666668]
You don't always have to rely on numpy for simple things like this.
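To illustrate the lazy-evaluation point, here is a small sketch that feeds the generator an endless input and takes only the first few averages. It assumes Python 3; the use of `itertools.count` and the name `first_five` are only for illustration.

```python
import itertools

def average_series(items):
    total = 0  # running sum of the items seen so far
    for i, item in enumerate(items, 1):
        total += item
        yield total / i  # true division in Python 3

# itertools.count(1) is an endless stream 1, 2, 3, ...;
# islice stops after five averages, so nothing beyond that is computed.
first_five = list(itertools.islice(average_series(itertools.count(1)), 5))
print(first_five)  # [1.0, 1.5, 2.0, 2.5, 3.0]
```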
Upvotes: 3