Reputation: 117
Quick question, which is more efficient for finding the smallest number (float) in a long list (10000+ elements)
is it
min(mylist)
or
mylist.sort()
and then returning
mylist[0]
or something else...
thanks!
Upvotes: 11
Views: 16834
Reputation: 353359
Frst, if you care about performance in Python (which isn't always a sensible thing to care about, but that's another conversation), you should be using the timeit module. Even in C it's hard to predict how certain functions will behave after compilation, and it's harder in Python. People are often confidently expressing opinions about which functions are faster which are data-dependent. Then -- by using timeit, I mean -- you could've found out yourself.
Second, if you really care about performance on lists of floats, you shouldn't be using lists at all, but numpy arrays. Using IPython here, under Python 2.7.2, which makes timing things easy:
In [41]: import random, numpy
In [42]: a = [0.1*i for i in range(10**5)]
In [43]: timeit min(a)
100 loops, best of 3: 4.55 ms per loop
In [44]: timeit sorted(a)[0]
100 loops, best of 3: 4.57 ms per loop
In [45]: random.shuffle(a)
In [46]: timeit min(a)
100 loops, best of 3: 6.06 ms per loop
In [47]: timeit min(a) # to make sure it wasn't a fluke
100 loops, best of 3: 6.07 ms per loop
In [48]: timeit sorted(a)[0]
10 loops, best of 3: 65.9 ms per loop
In [49]: b = numpy.array(a)
In [50]: timeit b.min()
10000 loops, best of 3: 97.5 us per loop
And we note a few things. (1) Python's sort (timsort) works very well on data which has sorted runs, so sorting an already sorted list has almost no penalty. (2) Sorting a random list, on the other hand, is very much slower, and this will only get worse as the data gets larger. (3) Numpy.min() on a float array works sixty times faster than min on a Python list, because it doesn't have to be as general.
Upvotes: 13
Reputation: 136351
If the list is already populated, min()
is the most efficient way.
There are some tricks you might use in special scenarios:
O(1)
.min()
. Make sure you understand the effects, mainly a slower insertion. (credit: interjay)Upvotes: 12
Reputation: 2081
Well, you definitely have to iterate over the whole list, so your runtime should be O(n).
As a sort already takes O(n log n) time, this is obviously not the best way to do this. I do not know the implementation of min(), but it should be the right way to do this.
Upvotes: 2