Gabriel
Gabriel

Reputation: 42329

Find second largest element in list with repeated elements

I have a list with several very big values set on purpose to differentiate those indexes, it looks like this:

a = [1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.]

I need to find the second largest value in that list which isn't equal to 9999. (in the case above it would be 7.4) in the most efficient way possible (my list can get quite big)

In this question Retrieve the two highest item from a list containing 100,000 integers the heapq.nlargest function is mentioned but since I have more than one value 9999. it wouldn't work.

Upvotes: 1

Views: 2028

Answers (4)

rtrwalker
rtrwalker

Reputation: 1021

If you want to use numpy you can use masked arrays to skip 'bad' values:

import numpy as np
a = np.array([1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.])
ma = np.ma.masked_values(a, 9999., copy=False)
ma.max()
7.4

you can easily add exclusions to your mask:

ma = np.ma.masked_values(ma, 7.4, copy=False)
ma.max()
7.0
ma.mask[ma>=5]=True   
ma.max()
3.7

Upvotes: 0

user2555451
user2555451

Reputation:

Here is an alternate method:

>>> a = [1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.]
>>> sorted(set(a))[-2]
7.4
>>>

And, believe it or not, it is actually quite a lot faster than the accepted solution:

>>> from timeit import timeit
>>> timeit("a=range(10000000);print sorted(set(a))[-2]", number=10)
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
34.327036257401424
>>> # This is NPE's answer
>>> timeit("a=range(10000000);maxa = max(a);print max(val for val in a if val != maxa)", number=10)
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
53.22811809880869
>>>

Above is a test that runs 10 times and works with a list that contains 10,000,000 items. Unless there is a flaw in my test (which I don't think there is), the solution I gave is clearly much faster.

Upvotes: 5

thefourtheye
thefourtheye

Reputation: 239443

a = set([1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.])
a.remove(max(a))
print max(a)

This uses set to make sure that we deal with only unique items and then we remove the maximum value, so that next time when we call max, we ll be left with the second best maximum number.

Upvotes: 2

NPE
NPE

Reputation: 500257

>>> max(val for val in a if val != 9999)
7.4

This has O(n) time complexity.

If the 9999 isn't fixed, you can generalize this by using max(a) instead of 9999:

>>> maxa = max(a)
>>> max(val for val in a if val != maxa)
7.4

(Although I suspect this isn't what you want.)

Upvotes: 3

Related Questions