Harrison
Harrison

Reputation: 5396

Merge list of dictionaries based on matching values

I have a set of data similar to this:

trades =  [{'ORDERID': 123, 'LEAVESQTY': 2200}, 
            {'ORDERID': 123, 'LEAVESQTY': 500}, 
            {'ORDERID': 456, 'LEAVESQTY': 100}, 
            {'ORDERID': 789, 'LEAVESQTY': 300}]

I am looking for an efficient way to find the lowest value of LEAVESQTY for each matching ORDERID in the list of dictionaries.

For example, my desired result for this example would be:

output = [{'ORDERID': 123, 'LEAVESQTY': 5000},
          {'ORDERID': 456, 'LEAVESQTY': 100},
          {'ORDERID': 789, 'LEAVESQTY': 300}]

I have tried to apply the same approach as similar questions that merge dictionaries based on matching key values, i'm just having trouble merging based on values.

Upvotes: 1

Views: 72

Answers (4)

zipa
zipa

Reputation: 27879

You can use itertools.groupby():

from itertools import groupby

common = lambda x: x['ORDERID']

grouped = groupby(sorted(trades, key=common), key=common)

output = [min(group) for key, group in grouped]

output
#[{'LEAVESQTY': 500, 'ORDERID': 123},
# {'LEAVESQTY': 100, 'ORDERID': 456},
# {'LEAVESQTY': 300, 'ORDERID': 789}]

EDIT

The sorted is necessary as @jpp suggested for this to work as it will otherwise produce duplicate keys if the data looks like this:

#trades =  [{'ORDERID': 789, 'LEAVESQTY': 400},
#           {'ORDERID': 123, 'LEAVESQTY': 2200}, 
#            {'ORDERID': 123, 'LEAVESQTY': 500}, 
#            {'ORDERID': 456, 'LEAVESQTY': 100}, 
#            {'ORDERID': 789, 'LEAVESQTY': 300},
#            {'ORDERID': 789, 'LEAVESQTY': 150}]

Upvotes: 1

jpp
jpp

Reputation: 164773

You can use collections.defaultdict to construct a dictionary to list mapping.

Then use a dictionary comprehension to calculate the minimum of each list.

from collections import defaultdict

trades =  [{'ORDERID': 123, 'LEAVESQTY': 2200}, 
           {'ORDERID': 123, 'LEAVESQTY': 500}, 
           {'ORDERID': 456, 'LEAVESQTY': 100}, 
           {'ORDERID': 789, 'LEAVESQTY': 300}]

d = defaultdict(list)

for item in trades:
    d[item['ORDERID']].append(item['LEAVESQTY'])

res = [{'ORDERID': k, 'LEAVESQTY': min(v)} for k, v in d.items()]

[{'LEAVESQTY': 500, 'ORDERID': 123},
 {'LEAVESQTY': 100, 'ORDERID': 456},
 {'LEAVESQTY': 300, 'ORDERID': 789}]

Upvotes: 1

Rakesh
Rakesh

Reputation: 82785

Using a simple iteration.

Demo:

d = {}

trades =  [{'ORDERID': 123, 'LEAVESQTY': 2200}, 
            {'ORDERID': 123, 'LEAVESQTY': 500}, 
            {'ORDERID': 456, 'LEAVESQTY': 100}, 
            {'ORDERID': 789, 'LEAVESQTY': 300}]

for i in trades:
    if i['ORDERID'] not in d:
        d[i["ORDERID"]] = i
    else:
        if d[i["ORDERID"]]["LEAVESQTY"] > i["LEAVESQTY"]:
            d[i["ORDERID"]]["LEAVESQTY"] = i["LEAVESQTY"]
print(d.values())

Output:

[{'ORDERID': 456, 'LEAVESQTY': 100}, {'ORDERID': 123, 'LEAVESQTY': 500}, {'ORDERID': 789, 'LEAVESQTY': 300}]

Upvotes: 1

jpp
jpp

Reputation: 164773

Here's one way using sorted and toolz.unique. The idea is to sort by LEAVESQTY and then drop duplicates by ORDERID.

If you don't have access to the toolz library, the logic is identical to the unique_everseen recipe found in the itertools docs.

from operator import itemgetter
from toolz import unique

trades =  [{'ORDERID': 123, 'LEAVESQTY': 2200}, 
           {'ORDERID': 123, 'LEAVESQTY': 500}, 
           {'ORDERID': 456, 'LEAVESQTY': 100}, 
           {'ORDERID': 789, 'LEAVESQTY': 300}]

sorter = sorted(trades, key=itemgetter('LEAVESQTY'))
res = list(unique(sorter, key=itemgetter('ORDERID')))

print(res)

[{'LEAVESQTY': 100, 'ORDERID': 456},
 {'LEAVESQTY': 300, 'ORDERID': 789},
 {'LEAVESQTY': 500, 'ORDERID': 123}]

Upvotes: 2

Related Questions