cwohlfart

Reputation: 177

Get the n-smallest values from a nested python list

I have the following list:

l = [(('01001', '01003'), 4.15),
 (('01001', '01005'), 2.83),
 (('01001', '01007'), 3.32),
 (('01001', '01008'), 2.32),
 (('01001', '01009'), 9.32),
 (('01001', '01007'), 0.32),
 (('01002', '01009'), 6.83),
 (('01002', '01011'), 2.53),
 (('01002', '01009'), 6.83),
 (('01002', '01011'), 2.53),
 (('01002', '01009'), 6.83),
 (('01002', '01011'), 2.53),
 (('01003', '01013'), 20.50),
 (('01003', '01013'), 10.50),
 (('01003', '01013'), 0.50),
 (('01003', '01013'), 2.50),
 (('01003', '01013'), 20.30),
 (('01003', '01013'), 12.50),
 (('01003', '01013'), 1.50),
 (('01003', '01013'), 2.40)]

I would like to select the n-smallest values for the first element of this list ('01001', '01002', and '01003').

I was able to calculate the min value per group with this code:

from itertools import groupby
from statistics import mean

{k:min(v for *_, v in v) for k,v in groupby(result_map, lambda x: x[0][0])}

but I would like to get the 3 smallest values per group and print the corresponding second elements.

Expected outcome would be a dictionary like this:

{'01001': ['01007', '01008', '01005'], '01002': ['01011', '01009', '01013'], '01003': ['01013', '01013', '01013']}

Any help would be much appreciated!

Upvotes: 3

Views: 124

Answers (3)

Arty

Reputation: 16767

{k:[e[0][1] for e in sorted(v, key = lambda x: x[1])][:n] for k,v in groupby(result_map, lambda x: x[0][0])}

The line above is your own groupby code, modified slightly to compute the list of the n smallest entries instead of the min (set n to the number of entries you want, e.g. n = 3).
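One caveat worth sketching: itertools.groupby only groups consecutive items, so the one-liner relies on the input already being ordered by the first element (which is the case for the list in the question). As a hedged alternative, the variant below collects the groups in a defaultdict and uses heapq.nsmallest, so it does not depend on input order. The names data and n_smallest_by_group are made up for this illustration, and the sample is a shortened, reordered excerpt of the question's list.

```python
from collections import defaultdict
from heapq import nsmallest

# Shortened sample in the question's format: ((first, second), value).
data = [
    (('01001', '01003'), 4.15),
    (('01001', '01005'), 2.83),
    (('01001', '01007'), 3.32),
    (('01001', '01008'), 2.32),
    (('01002', '01009'), 6.83),
    (('01002', '01011'), 2.53),
    (('01001', '01007'), 0.32),  # deliberately not adjacent to the other '01001' rows
]

def n_smallest_by_group(rows, n):
    """Return {first: [second, ...]} with the seconds of the n smallest values."""
    groups = defaultdict(list)
    for (first, second), value in rows:
        # Store (value, second) so tuples compare by value first.
        groups[first].append((value, second))
    return {
        first: [second for _, second in nsmallest(n, pairs)]
        for first, pairs in groups.items()
    }

result = n_smallest_by_group(data, 3)
print(result)
```

Repeated second elements are kept here, as in the first one-liner.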

From the example in your question it wasn't clear whether you want repeated elements in the n-smallest list or not (the second entry, '01002': ['01011', '01009', '01013'], has no repetitions, but the third, '01003': ['01013', '01013', '01013'], does), so here is a second one-liner that solves the task without repetitions:

{k:[e[0][1] for e in sorted({f[0][1] : f for f in v}.values(), key = lambda x: x[1])][:n] for k,v in groupby(result_map, lambda x: x[0][0])}
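To make the de-duplication step easier to follow, here is an expanded sketch of the same idea. Note one difference: the dict comprehension in the one-liner keeps the last entry seen for each second element, whereas the sketch below keeps the entry with the smallest value, which may or may not be what you want. The names rows and n_smallest_unique and the sample data are assumptions made for this illustration; like the one-liner, it assumes the input is already ordered by the first element.

```python
from itertools import groupby

# Small sample in the question's format; '01007' appears twice for '01001'.
rows = [
    (('01001', '01007'), 0.32),
    (('01001', '01005'), 2.83),
    (('01001', '01007'), 3.32),  # repeated second element with a larger value
    (('01001', '01008'), 2.32),
    (('01003', '01013'), 0.50),
    (('01003', '01013'), 2.50),
]

def n_smallest_unique(rows, n):
    """Return {first: [second, ...]} without repeated second elements."""
    result = {}
    # groupby only merges consecutive rows, so rows must be ordered by first.
    for first, group in groupby(rows, key=lambda row: row[0][0]):
        best = {}  # second element -> smallest value seen for it
        for (_, second), value in group:
            if second not in best or value < best[second]:
                best[second] = value
        # Sort the unique second elements by their smallest value.
        result[first] = sorted(best, key=best.get)[:n]
    return result

unique_result = n_smallest_unique(rows, 3)
print(unique_result)
```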

Full version of code can be found and tried online here!

Upvotes: 1

hiro protagonist

Reputation: 46921

a fairly explicit but straightforward version. i iterate over the input list lst only once:

from bisect import bisect_left
from collections import defaultdict

lst = [(('01001', '01003'), 4.15),
       ...
       (('01003', '01013'), 2.40)]

maxlen = 3

ret = defaultdict(list)
val = defaultdict(list)
for ((first, second), value) in lst:
    r = ret[first]
    v = val[first]
    if not r:
        r.append(second)
        v.append(value)
    else:
        if value not in v:
            idx = bisect_left(v, value)
            r.insert(idx, second)
            v.insert(idx, value)
    if len(r) > maxlen:
        # trim both lists back to the maxlen smallest entries
        ret[first] = r[:maxlen]
        val[first] = v[:maxlen]

print(ret)  # defaultdict(<class 'list'>, {
#  '01001': ['01007', '01008', '01005'], 
#  '01002': ['01011', '01009'], 
#  '01003': ['01013', '01013', '01013']})

print(val)  # defaultdict(<class 'list'>, {
#  '01001': [0.32, 2.32, 2.83], 
#  '01002': [2.53, 6.83], 
#  '01003': [0.5, 1.5, 2.4]})

where i use the defaultdict val to store the values corresponding to the result ret.

and i use the bisect module to find the insert index idx.

the design might be better if the values and the results were kept in the same data structure instead of being split across ret and val (e.g. a tuple or even a namedtuple).
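a minimal sketch of that single-structure idea, assuming a namedtuple called Entry (a name made up here) that holds the value and the second element together. it uses bisect.insort to keep each list sorted and mirrors the loop above by skipping values that are already stored. the input is a shortened excerpt of the question's list.

```python
from bisect import insort
from collections import defaultdict, namedtuple

# Hypothetical record type: value comes first so entries sort by value.
Entry = namedtuple('Entry', ['value', 'second'])

lst = [
    (('01001', '01003'), 4.15),
    (('01001', '01005'), 2.83),
    (('01001', '01007'), 3.32),
    (('01001', '01008'), 2.32),
    (('01001', '01007'), 0.32),
    (('01002', '01009'), 6.83),
    (('01002', '01011'), 2.53),
    (('01002', '01009'), 6.83),
]

maxlen = 3

best = defaultdict(list)  # first -> sorted list of Entry, at most maxlen long
for (first, second), value in lst:
    entries = best[first]
    # Skip values that are already stored, as the two-list version does.
    if any(entry.value == value for entry in entries):
        continue
    insort(entries, Entry(value, second))
    del entries[maxlen:]  # drop anything beyond the maxlen smallest

seconds = {first: [entry.second for entry in entries]
           for first, entries in best.items()}
print(seconds)
```

with this layout a value and its second element cannot drift apart, since they are inserted and trimmed together.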

Upvotes: 0

IoaTzimas

Reputation: 10624

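The following builds, for each first element, a list of its first three second elements (note that sorted here orders the second elements themselves, not their numeric values):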

d={i:sorted([k[0][1] for k in l if k[0][0]==i])[:3] for i in set([i[0][0] for i in l])}

print(d)

{'01001': ['01003', '01005', '01007'], '01002': ['01009', '01009', '01009'], '01003': ['01013', '01013', '01013']}

Upvotes: 1
