Reputation: 61044
Suppose I have a list of tuples:
x = [(1,2), (3,4), (7,4), (5,4)]
Of all tuples that share the second element, I want to preserve the tuple with the largest first element:
y = [(1,2), (7,4)]
What is the best way to achieve this in Python?
Thanks for the answers, and for showing what collections has to offer!
Upvotes: 9
Views: 2156
Reputation: 71004
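Use collections.defaultdict:
import collections
max_elements = collections.defaultdict(tuple)
for item in x:
    # The default value () compares smaller than any non-empty tuple.
    if item > max_elements[item[1]]:
        max_elements[item[1]] = item
# list() makes this a real list on Python 3, where values() returns a view.
y = list(max_elements.values())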
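As a rough sketch, the same idea can be wrapped in a function. The name max_by_second is made up for illustration and is not part of the original answer. The approach works because the empty tuple that defaultdict(tuple) supplies compares smaller than any non-empty tuple, so the first tuple seen for a key always replaces the default.

```python
import collections

def max_by_second(pairs):
    """Keep, for each second element, the tuple with the largest first element."""
    # Hypothetical helper name; the default () is smaller than any non-empty tuple.
    best = collections.defaultdict(tuple)
    for item in pairs:
        if item > best[item[1]]:
            best[item[1]] = item
    return list(best.values())

result = max_by_second([(1, 2), (3, 4), (7, 4), (5, 4)])
print(sorted(result))  # [(1, 2), (7, 4)]
```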
Upvotes: 5
Reputation: 40029
If you can assume that tuples with identical second elements appear contiguously in the original list x, you can leverage itertools.groupby:
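import itertools
import operator
def max_first_elem(x):
    # Group consecutive tuples by their second element.
    groups = itertools.groupby(x, operator.itemgetter(1))
    # Within a group the second elements are equal, so max() picks the largest first element.
    y = [max(group) for _, group in groups]
    return y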
Note that this will guarantee preservation of the order of the groups (by the second tuple element), if that is a desired constraint for the output.
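If the contiguity assumption does not hold, one possible workaround is to sort by the second element before grouping. The sketch below assumes that losing the original order is acceptable; the function name max_first_elem_unsorted is hypothetical.

```python
import itertools
import operator

def max_first_elem_unsorted(x):
    """Like the groupby approach, but without requiring contiguous groups."""
    second = operator.itemgetter(1)
    # Sorting by the second element makes equal keys adjacent,
    # which is what itertools.groupby needs.
    groups = itertools.groupby(sorted(x, key=second), key=second)
    return [max(group) for _, group in groups]

print(max_first_elem_unsorted([(3, 4), (1, 2), (7, 4), (5, 4)]))  # [(1, 2), (7, 4)]
```

The result comes out ordered by the second element rather than in the input order, and the sort adds an O(n log n) step.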
Upvotes: 2
Reputation: 10162
>>> from collections import defaultdict
>>> d = defaultdict(tuple)
>>> x = [(1,2), (3,4), (7,4), (5,4)]
>>> for a, b in x:
...     d[b] = max(d[b], (a, b))
...
>>> list(d.values())
[(1, 2), (7, 4)]
Upvotes: 0
Reputation: 304147
Similar to Aaron's answer (this relies on the first elements being nonnegative, since defaultdict(int) defaults to 0):
>>> from collections import defaultdict
>>> x = [(1,2), (3,4), (7,4), (5,4)]
>>> d = defaultdict(int)
>>> for v,k in x:
...     d[k] = max(d[k],v)
...
>>> y=[(k,v) for v,k in d.items()]
>>> y
[(1, 2), (7, 4)]
Note that the order is not preserved with this method. To preserve the order, use this instead:
>>> y = [(k,v) for k,v in x if d[v]==k]
>>> y
[(1, 2), (7, 4)]
Here is another way. It uses more storage but makes fewer calls to max, so it may be faster:
>>> d = defaultdict(list)
>>> for k,v in x:
...     d[v].append(k)
...
>>> y = [(max(k),v) for v,k in d.items()]
>>> y
[(1, 2), (7, 4)]
Again, a simple modification preserves the order:
>>> y = [(k,v) for k,v in x if max(d[v])==k]
>>> y
[(1, 2), (7, 4)]
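The order-preserving filter above calls max on a group's list once per input tuple. A possible refinement, sketched here with made-up names (firsts_by_second, best), is to compute each group's maximum once and then filter against that lookup table:

```python
from collections import defaultdict

x = [(3, 4), (1, 2), (7, 4), (5, 4)]

# Collect all first elements under their shared second element.
firsts_by_second = defaultdict(list)
for first, second in x:
    firsts_by_second[second].append(first)

# One max() call per group instead of one per input tuple.
best = {second: max(firsts) for second, firsts in firsts_by_second.items()}

# Keep tuples in their original order if they hold their group's maximum.
y = [(first, second) for first, second in x if best[second] == first]
print(y)  # [(1, 2), (7, 4)]
```

As with the original filter, a maximal tuple that occurs more than once in x is kept each time it appears.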
Upvotes: 5
Reputation: 61044
My own attempt, slightly inspired by aaronsterling:
(Oh yeah, all elements are nonnegative.)
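def processtuples(x):
    d = {}
    for item in x:
        # Compare the current tuple, not the whole list x. The -1 sentinel
        # works because all elements are nonnegative.
        if item[0] > d.get(item[1], -1):
            d[item[1]] = item[0]
    y = []
    for k in d:
        y.append((d[k], k))
    y.sort()
    return y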
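The -1 sentinel only works because the elements are nonnegative. If that assumption were ever dropped, a membership test could replace the sentinel. This is a minimal sketch under that hypothetical requirement, and the name process_tuples_any_sign is invented for illustration:

```python
def process_tuples_any_sign(x):
    """Variant of the dict approach that does not rely on a sentinel value."""
    best = {}
    for first, second in x:
        # Store the first value seen for a key, then only larger ones.
        if second not in best or first > best[second]:
            best[second] = first
    return sorted((first, second) for second, first in best.items())

print(process_tuples_any_sign([(1, 2), (3, 4), (7, 4), (5, 4)]))  # [(1, 2), (7, 4)]
print(process_tuples_any_sign([(-5, 1), (-2, 1), (3, 0)]))        # [(-2, 1), (3, 0)]
```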
Upvotes: 0