Reputation: 1301
I want to write this code in a more Pythonic way. My real array is much bigger than this example.
( 5+10+20+3+2 ) / 5
print(np.mean(array,key=lambda x:x[1]))
TypeError: mean() got an unexpected keyword argument 'key'
array = [('a', 5) , ('b', 10), ('c', 20), ('d', 3), ('e', 2)]
sum = 0
for i in range(len(array)):
    sum = sum + array[i][1]
average = sum / len(array)
print(average)
import numpy as np
print(np.mean(array,key=lambda x:x[1]))
How can I avoid this error? I want to use the second example.
I'm using Python 3.7
Upvotes: 24
Views: 4664
Reputation: 24163
If you are using Python 3.4 or above, you could use the statistics module:
from statistics import mean
average = mean(value[1] for value in array)
Or if you're using a version of Python older than 3.4:
average = sum(value[1] for value in array) / len(array)
These solutions both use a nice feature of Python called a generator expression. The expression
value[1] for value in array
yields the values one at a time without building an intermediate list, which keeps it fast and memory efficient. See PEP 289 -- Generator Expressions.
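As a rough illustration of the generator expression described above, here is a small sketch using the question's sample data. The name `values` is introduced only for this example.

```python
from statistics import mean

array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

# A generator expression produces items lazily, one at a time.
values = (value[1] for value in array)

# mean() accepts any iterable, so the generator can be passed directly.
average = mean(values)
print(average)

# A generator can only be consumed once; after mean() it is exhausted.
print(list(values))  # []
```

If the numbers are needed more than once, building a list with `[value[1] for value in array]` is the simpler choice.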
If you're using Python 2 and summing integers, the division will be integer division, which truncates the result, e.g.:
>>> 25 / 4
6
>>> 25 / float(4)
6.25
To ensure we don't have integer division we could set the starting value of sum to be the float value 0.0. However, this also means we have to make the generator expression explicit with parentheses, otherwise it's a syntax error, and it's less pretty, as noted in the comments:
average = sum((value[1] for value in array), 0.0) / len(array)
It's probably best to use fsum from the math module, which will return a float:
from math import fsum
average = fsum(value[1] for value in array) / len(array)
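A short sketch of what fsum gives you, assuming Python 3 and the question's sample data. The list `tenths` is made up for illustration and is not part of the original question.

```python
from math import fsum

array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

# fsum always returns a float, even when every input is an int.
average = fsum(value[1] for value in array) / len(array)
print(average)  # 8.0

# fsum also tracks intermediate rounding error when adding floats.
tenths = [0.1] * 10
print(fsum(tenths))  # 1.0
```

Whether the built-in sum gives exactly the same result for floats depends on the interpreter version, so fsum is the safer choice when float accuracy matters.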
Upvotes: 28
Reputation: 31
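If you're open to more golf-like solutions, you can transpose your array with vanilla Python, get a tuple of just the numbers, and calculate the mean with the line below. In Python 3, zip returns an iterator, so it has to be wrapped in list before it can be indexed:
sum(list(zip(*array))[1])/len(array)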
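To make the transposition step more visible, here is a minimal sketch, assuming Python 3. The names `labels` and `numbers` are chosen for this example only.

```python
array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

# zip(*array) groups the first items together and the second items together.
labels, numbers = zip(*array)
print(labels)   # ('a', 'b', 'c', 'd', 'e')
print(numbers)  # (5, 10, 20, 3, 2)

average = sum(numbers) / len(numbers)
print(average)  # 8.0
```

Note that unpacking with `*array` materializes both tuples in memory, which may matter for a very large input.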
Upvotes: 3
Reputation: 7206
If you do want to use numpy, cast it to a numpy.array and select the axis you want using numpy indexing:
import numpy as np
array = np.array([('a', 5) , ('b', 10), ('c', 20), ('d', 3), ('e', 2)])
print(array[:,1].astype(float).mean())
# 8.0
The cast to a numeric type is needed because the original list contains both strings and numbers, so numpy stores every element as a string. In this case you could use float or int; for this data it makes no difference to the result.
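A quick way to check what numpy actually infers for the mixed tuples is to inspect the dtype. The sketch below assumes numpy is installed; `as_strings` and `numbers` are names introduced here. It also shows one possible alternative: pulling the numbers out before building the array, so no cast is needed.

```python
import numpy as np

array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

# Mixed strings and ints: inspect the dtype numpy chose for the whole array.
as_strings = np.array(array)
print(as_strings.dtype.kind)  # 'U' indicates a Unicode string dtype

# Alternative: extract the numbers first so numpy builds a numeric array.
numbers = np.array([value for _, value in array])
print(numbers.mean())  # 8.0
```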
Upvotes: 3
Reputation: 24107
You can simply use:
print(sum(tup[1] for tup in array) / len(array))
Or for Python 2:
print(sum(tup[1] for tup in array) / float(len(array)))
Or a little bit more concisely for Python 2:
from math import fsum
print(fsum(tup[1] for tup in array) / len(array))
Upvotes: 2
Reputation: 512
You can use map instead of a list comprehension:
sum(map(lambda x:int(x[1]), array)) / len(array)
or functools.reduce (if you use Python 2.x, just reduce, not functools.reduce):
import functools
functools.reduce(lambda acc, y: acc + y[1], array, 0) / len(array)
Upvotes: 2
Reputation: 20500
Just find the average using sum and number of elements of the list.
array = [('a', 5) , ('b', 10), ('c', 20), ('d', 3), ('e', 2)]
avg = float(sum(value[1] for value in array)) / float(len(array))
print(avg)
#8.0
Upvotes: 0
Reputation: 19885
With pure Python:
from operator import itemgetter
acc = 0
count = 0
for value in map(itemgetter(1), array):
    acc += value
    count += 1
mean = acc / count
An iterative approach can be preferable if your data cannot fit in memory as a list (since you said it was big). If it can, prefer a declarative approach:
data = [sub[1] for sub in array]
mean = sum(data) / len(data)
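The iterative loop shown earlier in this answer could be wrapped in a function so it works with any iterable, including one that streams pairs from a file or database. This is only a sketch: `streaming_mean` is a hypothetical helper name, and the empty-input check is an addition not present in the original loop.

```python
from operator import itemgetter


def streaming_mean(pairs):
    """Hypothetical helper: mean of the second item of each pair, in one pass."""
    total = 0
    count = 0
    for value in map(itemgetter(1), pairs):
        total += value
        count += 1
    if count == 0:
        # Added guard: avoid a ZeroDivisionError on empty input.
        raise ValueError("cannot compute the mean of empty data")
    return total / count


array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

# iter(array) stands in for a source that cannot be held in memory at once.
print(streaming_mean(iter(array)))  # 8.0
```

Because the function never calls len(), it only needs to see each pair once.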
If you are open to using numpy, I find this cleaner:
import numpy as np
a = np.array(array)
mean = a[:, 1].astype(int).mean()
Upvotes: 2