Reputation: 177
I have the following list:
l = [(('01001', '01003'), 4.15),
(('01001', '01005'), 2.83),
(('01001', '01007'), 3.32),
(('01001', '01008'), 2.32),
(('01001', '01009'), 9.32),
(('01001', '01007'), 0.32),
(('01002', '01009'), 6.83),
(('01002', '01011'), 2.53),
(('01002', '01009'), 6.83),
(('01002', '01011'), 2.53),
(('01002', '01009'), 6.83),
(('01002', '01011'), 2.53),
(('01003', '01013'), 20.50),
(('01003', '01013'), 10.50),
(('01003', '01013'), 0.50),
(('01003', '01013'), 2.50),
(('01003', '01013'), 20.30),
(('01003', '01013'), 12.50),
(('01003', '01013'), 1.50),
(('01003', '01013'), 2.40)]
I would like to select the n smallest values for each distinct first element of this list ('01001', '01002', and '01003').
I was able to calculate the minimum value with this code:
from itertools import groupby
{k: min(val for *_, val in grp) for k, grp in groupby(l, lambda x: x[0][0])}
However, I would like to get the 3 smallest values per group and output the corresponding second elements.
The expected outcome would be a dictionary like this:
{'01001': ['01007', '01008', '01005'], '01002': ['01011', '01009', '01013'], '01003': ['01013', '01013', '01013']}
Any help would be much appreciated!
Upvotes: 3
Views: 124
Reputation: 16767
{k: [e[0][1] for e in sorted(v, key=lambda x: x[1])][:n] for k, v in groupby(l, lambda x: x[0][0])}
This is the groupby code you provided, modified slightly to compute the list of the n smallest entries instead of the minimum (set n = 3 for your example).
From the example in your question it isn't clear whether you want repeated elements in the n-smallest list: the second entry, '01002': ['01011', '01009', '01013'], has no repetitions, but the third, '01003': ['01013', '01013', '01013'], does. So here is a second one-liner that solves the task without repetitions (for a repeated second element it keeps the last value listed):
{k: [e[0][1] for e in sorted({f[0][1]: f for f in v}.values(), key=lambda x: x[1])][:n] for k, v in groupby(l, lambda x: x[0][0])}
Full version of code can be found and tried online here!
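Note that groupby only merges consecutive items with the same key, so the one-liners above rely on the list already being ordered by its first element, as it is in the question. Here is a minimal sketch of the first one-liner wrapped in a function that sorts first; the name n_smallest_by_group and the sample data are made up for illustration:

```python
from itertools import groupby


def n_smallest_by_group(pairs, n):
    """Map each first ID to the second IDs of its n smallest values."""
    def first_id(item):
        return item[0][0]

    # groupby only merges consecutive items, so sort by the grouping key first.
    ordered = sorted(pairs, key=first_id)
    return {
        key: [second for (_, second), _ in sorted(group, key=lambda item: item[1])[:n]]
        for key, group in groupby(ordered, key=first_id)
    }


# Deliberately unordered sample (hypothetical subset of the question's data).
sample = [
    (('01002', '01011'), 2.53),
    (('01001', '01003'), 4.15),
    (('01001', '01007'), 0.32),
    (('01002', '01009'), 6.83),
    (('01001', '01008'), 2.32),
]
result = n_smallest_by_group(sample, 2)
print(result)  # {'01001': ['01007', '01008'], '01002': ['01011', '01009']}
```

Without the sorted call, the unordered sample would yield '01002' as two separate groups, and the later group would overwrite the earlier one in the dictionary.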
Upvotes: 1
Reputation: 46921
A fairly explicit but straightforward version that iterates only once over the input list lst:
from bisect import bisect_left
from collections import defaultdict
lst = [(('01001', '01003'), 4.15),
...
(('01003', '01013'), 2.40)]
maxlen = 3
ret = defaultdict(list)
val = defaultdict(list)
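for ((first, second), value) in lst:
    r = ret[first]
    v = val[first]
    if not r:
        # first entry for this key
        r.append(second)
        v.append(value)
    else:
        # skip values already stored for this key
        if value not in v:
            # keep both lists sorted by value
            idx = bisect_left(v, value)
            r.insert(idx, second)
            v.insert(idx, value)
            if len(r) > maxlen:
                # keep only the maxlen smallest entries
                ret[first] = r[:maxlen]
                val[first] = v[:maxlen]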
print(ret) # defaultdict(<class 'list'>, {
# '01001': ['01007', '01008', '01005'],
# '01002': ['01011', '01009'],
# '01003': ['01013', '01013', '01013']})
print(val) # defaultdict(<class 'list'>, {
# '01001': [0.32, 2.32, 2.83],
# '01002': [2.53, 6.83],
# '01003': [0.5, 1.5, 2.4]})
Here the defaultdict val stores the values corresponding to the results in ret, and the bisect module is used to find the insertion index idx.
The design might be better if the values and the results were kept in the same data structure rather than separately in ret and val (e.g. a tuple or even a namedtuple).
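As a rough sketch of that single-structure idea, the values and the second elements could live together in a namedtuple kept sorted with bisect.insort. The names Entry and smallest_entries are hypothetical, and unlike the loop above this variant does not skip duplicate values:

```python
from bisect import insort
from collections import defaultdict, namedtuple

# Hypothetical record type; value comes first so entries sort by value.
Entry = namedtuple("Entry", ["value", "second"])


def smallest_entries(pairs, maxlen=3):
    """Keep the maxlen smallest (value, second) entries per first ID."""
    best = defaultdict(list)
    for (first, second), value in pairs:
        entries = best[first]
        insort(entries, Entry(value, second))  # insert, keeping the list sorted
        if len(entries) > maxlen:
            entries.pop()  # drop the current largest entry
    return dict(best)


lst = [
    (('01001', '01003'), 4.15),
    (('01001', '01005'), 2.83),
    (('01001', '01007'), 3.32),
    (('01001', '01008'), 2.32),
    (('01001', '01007'), 0.32),
]
top = smallest_entries(lst)
print([e.second for e in top['01001']])  # ['01007', '01008', '01005']
print([e.value for e in top['01001']])   # [0.32, 2.32, 2.83]
```

Because each list never grows beyond maxlen + 1 entries, every insertion stays cheap even for long inputs.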
Upvotes: 0
Reputation: 10624
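The following should work; it sorts each group by value and keeps the second elements of the three smallest entries:
d = {i: [k[0][1] for k in sorted((k for k in l if k[0][0] == i), key=lambda k: k[1])][:3] for i in sorted({k[0][0] for k in l})}
print(d)
{'01001': ['01007', '01008', '01005'], '01002': ['01011', '01011', '01011'], '01003': ['01013', '01013', '01013']}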
Upvotes: 1