user1228368
user1228368

Reputation: 117

most efficent way of finding the minimum float in a python list

Quick question, which is more efficient for finding the smallest number (float) in a long list (10000+ elements)

is it

min(mylist)

or

mylist.sort() 

and then returning

mylist[0]

or something else...

thanks!

Upvotes: 11

Views: 16834

Answers (3)

DSM
DSM

Reputation: 353359

Frst, if you care about performance in Python (which isn't always a sensible thing to care about, but that's another conversation), you should be using the timeit module. Even in C it's hard to predict how certain functions will behave after compilation, and it's harder in Python. People are often confidently expressing opinions about which functions are faster which are data-dependent. Then -- by using timeit, I mean -- you could've found out yourself.

Second, if you really care about performance on lists of floats, you shouldn't be using lists at all, but numpy arrays. Using IPython here, under Python 2.7.2, which makes timing things easy:

In [41]: import random, numpy
In [42]: a = [0.1*i for i in range(10**5)]
In [43]: timeit min(a)
100 loops, best of 3: 4.55 ms per loop
In [44]: timeit sorted(a)[0]
100 loops, best of 3: 4.57 ms per loop
In [45]: random.shuffle(a)
In [46]: timeit min(a)
100 loops, best of 3: 6.06 ms per loop
In [47]: timeit min(a) # to make sure it wasn't a fluke
100 loops, best of 3: 6.07 ms per loop
In [48]: timeit sorted(a)[0]
10 loops, best of 3: 65.9 ms per loop
In [49]: b = numpy.array(a)
In [50]: timeit b.min()
10000 loops, best of 3: 97.5 us per loop

And we note a few things. (1) Python's sort (timsort) works very well on data which has sorted runs, so sorting an already sorted list has almost no penalty. (2) Sorting a random list, on the other hand, is very much slower, and this will only get worse as the data gets larger. (3) Numpy.min() on a float array works sixty times faster than min on a Python list, because it doesn't have to be as general.

Upvotes: 13

Adam Matan
Adam Matan

Reputation: 136351

If the list is already populated, min() is the most efficient way.

There are some tricks you might use in special scenarios:

  • If you build the list from scratch, simply keep the smallest item yet in an external variable, so that the answer will be given in O(1).
  • If there are only Floats in the list, use an Array which gives better performance.
  • You can keep the list sorted using bisect.
  • Use a Python Heap, which even has an efficient implementation of min(). Make sure you understand the effects, mainly a slower insertion. (credit: interjay)

Upvotes: 12

Tia
Tia

Reputation: 2081

Well, you definitely have to iterate over the whole list, so your runtime should be O(n).

As a sort already takes O(n log n) time, this is obviously not the best way to do this. I do not know the implementation of min(), but it should be the right way to do this.

Upvotes: 2

Related Questions