Reputation: 9269
Suppose I have a list of numbers, such as l = [3, 5, 3, 6, 47, 89]. I can calculate the minimum, maximum and average with the following Python code:
minimum = min(l)
maximum = max(l)
avg = sum(l) / len(l)
Each of these calls iterates over the entire list, so this is slow for large lists, and it is a lot of code. Is there a Python module that can calculate all of these values together?
Upvotes: 2
Views: 2530
Reputation: 13539
Cython function:
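cimport cython  # needed for the decorators below

@cython.boundscheck(False)
@cython.wraparound(False)
def minmaxAvg(list x):
    cdef int i
    cdef int _min, _max, total
    _min = x[0]
    _max = x[0]
    total = 0
    # single pass: track the running min, max and sum together
    for i in x:
        if i < _min: _min = i
        elif i > _max: _max = i
        total += i
    # cast so the average is not truncated by integer division
    return _min, _max, total / <double>len(x)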
Pure Python function to compare against:
def builtinfuncs(x):
    a = min(x)
    b = max(x)
    avg = sum(x) / len(x)
    return a, b, avg
In [16]: x = [random.randint(0,1000) for _ in range(10000)]
In [17]: %timeit minmaxAvg(x)
10000 loops, best of 3: 34 µs per loop
In [18]: %timeit builtinfuncs(x)
1000 loops, best of 3: 460 µs per loop
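For readers without IPython, the builtin side of this comparison can be reproduced with the standard-library timeit module. This is only a rough sketch: best_time is a hypothetical helper name, not part of the answer, and the numbers it prints will differ from those shown here depending on hardware and Python version.

```python
import random
import timeit


def builtinfuncs(x):
    """Three separate passes using the builtins, as in the answer."""
    return min(x), max(x), sum(x) / len(x)


def best_time(func, data, number=100, repeat=3):
    """Hypothetical helper: best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(data))
    return min(timer.repeat(repeat=repeat, number=number)) / number


x = [random.randint(0, 1000) for _ in range(10000)]
elapsed = best_time(builtinfuncs, x)
print(f"builtinfuncs: {elapsed * 1e6:.1f} microseconds per call")
```

Any other callable that takes the list, such as a compiled Cython function, can be passed to best_time in the same way.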
Disclaimer:
- The speed results from Cython depend on the computer's hardware.
- It is not as flexible or foolproof as using the builtins. For example, you would have to change the function to handle anything other than integers.
- Before going down this path, you should ask yourself if this operation really is a big bottleneck in your application. It's probably not.
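To illustrate the single-pass logic without the integer-only restriction, here is a minimal pure Python sketch. The name min_max_avg is made up for this example. In CPython a loop like this is likely slower than calling the three builtins, so treat it as a way to see the idea (or to handle a generator that can only be consumed once), not as an optimization.

```python
def min_max_avg(values):
    """Return (minimum, maximum, mean) of a non-empty iterable in one pass."""
    iterator = iter(values)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("min_max_avg() requires at least one value") from None
    lo = hi = total = first
    count = 1
    for v in iterator:
        if v < lo:
            lo = v
        elif v > hi:
            hi = v
        total += v
        count += 1
    return lo, hi, total / count


print(min_max_avg([3, 5, 3, 6, 47, 89]))  # (3, 89, 25.5)
```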
Upvotes: 3
Reputation: 68156
If you have pandas installed, you can do something like this:
import numpy as np
import pandas
s = pandas.Series(np.random.normal(size=37))
stats = s.describe()
stats will be another Series that behaves like a dictionary:
print(stats)
count    37.000000
mean      0.072138
std       0.932000
min      -1.267888
25%      -0.688728
50%      -0.048624
75%       0.784244
max       2.501713
dtype: float64
stats['max']
2.501713
...etc. However, I don't recommend this unless you're striving simply for concise code. Here's why:
%%timeit
stats = s.describe()
# 100 loops, best of 3: 1.44 ms per loop
%%timeit
mymin = min(s)
mymax = max(s)
myavg = sum(s)/len(s)
# 10000 loops, best of 3: 89.5 µs per loop
I just can't imagine that you'll be able to squeeze any more performance out of the built-in functions with your own implementations (barring some Cython voodoo, maybe).
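If concise call sites are the main goal, a thin wrapper around the builtins gives named access to the results without pandas. The sketch below is one possible shape for that: Summary and summarize are hypothetical names, and the wrapper still makes three passes over the data.

```python
from typing import NamedTuple


class Summary(NamedTuple):
    minimum: float
    maximum: float
    mean: float


def summarize(values):
    """Bundle min, max and mean of a non-empty iterable into one result."""
    values = list(values)  # materialize so it can be traversed three times
    return Summary(min(values), max(values), sum(values) / len(values))


stats = summarize([3, 5, 3, 6, 47, 89])
print(stats.maximum)  # 89
print(stats.mean)     # 25.5
```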
Upvotes: 3