Reputation: 42329
I have a list with several very big values set on purpose to differentiate those indexes, it looks like this:
a = [1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.]
I need to find the second largest value in that list which isn't equal to 9999.
(in the case above it would be 7.4
) in the most efficient way possible (my list can get quite big)
In this question Retrieve the two highest item from a list containing 100,000 integers the heapq.nlargest
function is mentioned but since I have more than one value 9999.
it wouldn't work.
Upvotes: 1
Views: 2028
Reputation: 1021
If you want to use numpy you can use masked arrays to skip 'bad' values:
import numpy as np
a = np.array([1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.])
ma = np.ma.masked_values(a, 9999., copy=False)
ma.max()
7.4
you can easily add exclusions to your mask:
ma = np.ma.masked_values(ma, 7.4, copy=False)
ma.max()
7.0
ma.mask[ma>=5]=True
ma.max()
3.7
Upvotes: 0
Reputation:
Here is an alternate method:
>>> a = [1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.]
>>> sorted(set(a))[-2]
7.4
>>>
And, believe it or not, it is actually quite a lot faster than the accepted solution:
>>> from timeit import timeit
>>> timeit("a=range(10000000);print sorted(set(a))[-2]", number=10)
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
34.327036257401424
>>> # This is NPE's answer
>>> timeit("a=range(10000000);maxa = max(a);print max(val for val in a if val != maxa)", number=10)
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
53.22811809880869
>>>
Above is a test that runs 10 times and works with a list that contains 10,000,000 items. Unless there is a flaw in my test (which I don't think there is), the solution I gave is clearly much faster.
Upvotes: 5
Reputation: 239443
a = set([1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.])
a.remove(max(a))
print max(a)
This uses set
to make sure that we deal with only unique items and then we remove the maximum value, so that next time when we call max
, we ll be left with the second best maximum number.
Upvotes: 2
Reputation: 500257
>>> max(val for val in a if val != 9999)
7.4
This has O(n)
time complexity.
If the 9999
isn't fixed, you can generalize this by using max(a)
instead of 9999
:
>>> maxa = max(a)
>>> max(val for val in a if val != maxa)
7.4
(Although I suspect this isn't what you want.)
Upvotes: 3