Reputation: 311
Is there a simple way to calculate the mean of several (same-length) lists in Python? Say I have [[1, 2, 3], [5, 6, 7]] and want to obtain [3, 4, 5]. This has to be done 100,000 times, so it needs to be fast.
Upvotes: 17
Views: 34539
Reputation: 11
Slightly modified version for smooth work with RGB pixels:
def average(*l):
    l = tuple(l)
    # integer division (//) keeps the averaged channel values as ints
    def divide(x): return x // len(l)
    return list(map(divide, map(sum, zip(*l))))

print(average([0, 20, 200], [100, 40, 100]))
# [50, 30, 150]
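Since it accepts any number of lists, the same function can blend more than two pixels at once; for example:

print(average([255, 0, 0], [0, 255, 0], [0, 0, 255]))
# [85, 85, 85]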
Upvotes: 0
Reputation: 129497
In case you're using numpy (which seems to be more appropriate here):
>>> import numpy as np
>>> data = np.array([[1, 2, 3], [5, 6, 7]])
>>> np.average(data, axis=0)
array([ 3., 4., 5.])
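np.mean gives the same result here, since np.average only differs from it when weights are supplied:
>>> np.mean(data, axis=0)
array([ 3., 4., 5.])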
Upvotes: 32
Reputation: 5149
Extending NPE's answer: for a list containing n sublists which you want to average, use this (a numpy solution might be faster, but mine uses only built-ins):
def average(l):
    llen = len(l)
    def divide(x): return x / llen
    # in Python 3, map() returns an iterator; wrap the result in list() if you need a list
    return map(divide, map(sum, zip(*l)))
This sums each column across the sublists and then divides each sum by the number of sublists, producing the averages. You could inline the len computation and turn divide into a lambda like lambda x: x / len(l), but using an explicit function and pre-computing the length should be a bit faster.
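A quick check with the input from the question (wrapping the result in list(), since map() is lazy in Python 3):
>>> list(average([[1, 2, 3], [5, 6, 7]]))
[3.0, 4.0, 5.0]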
Upvotes: 2
Reputation: 500207
In [6]: l = [[1, 2, 3], [5, 6, 7]]
In [7]: [(x+y)/2 for x,y in zip(*l)]
Out[7]: [3, 4, 5]
(You'll need to decide whether you want integer or floating-point maths, and which kind of division to use.)
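Under Python 3, for instance, / always yields floats while // floors to an integer:
>>> [(x + y) / 2 for x, y in zip(*l)]
[3.0, 4.0, 5.0]
>>> [(x + y) // 2 for x, y in zip(*l)]
[3, 4, 5]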
On my computer, the list-comprehension version takes 1.24 us:
In [11]: %timeit [(x+y)/2 for x,y in zip(*l)]
1000000 loops, best of 3: 1.24 us per loop
Thus processing 100,000 inputs would take 0.124s.
Interestingly, NumPy arrays are slower on such small inputs:
In [27]: a = np.array(l)
In [28]: %timeit (a[0] + a[1]) / 2
100000 loops, best of 3: 5.3 us per loop
In [29]: %timeit np.average(a, axis=0)
100000 loops, best of 3: 12.7 us per loop
If the inputs get bigger, the relative timings will no doubt change.
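One way that plays out: if all 100,000 inputs can be stacked into a single array up front, the per-call overhead disappears and NumPy averages everything in one vectorized operation. A rough sketch under that assumption (shapes chosen to match the question, with random data standing in for the real lists):

import numpy as np

# assumption: 100,000 pairs of 3-element lists, stacked into one (100000, 2, 3) array
pairs = np.random.randint(0, 10, size=(100000, 2, 3))

# one call averages every pair; the result has shape (100000, 3)
means = pairs.mean(axis=1)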
Upvotes: 6