Reputation: 5396
I have a set of data similar to this:
trades = [{'ORDERID': 123, 'LEAVESQTY': 2200},
{'ORDERID': 123, 'LEAVESQTY': 500},
{'ORDERID': 456, 'LEAVESQTY': 100},
{'ORDERID': 789, 'LEAVESQTY': 300}]
I am looking for an efficient way to find the lowest value of LEAVESQTY
for each matching ORDERID
in the list of dictionaries.
For example, my desired result for this example would be:
output = [{'ORDERID': 123, 'LEAVESQTY': 5000},
{'ORDERID': 456, 'LEAVESQTY': 100},
{'ORDERID': 789, 'LEAVESQTY': 300}]
I have tried to apply the same approach as similar questions that merge dictionaries based on matching key values, i'm just having trouble merging based on values.
Upvotes: 1
Views: 72
Reputation: 27879
You can use itertools.groupby():
from itertools import groupby
common = lambda x: x['ORDERID']
grouped = groupby(sorted(trades, key=common), key=common)
output = [min(group) for key, group in grouped]
output
#[{'LEAVESQTY': 500, 'ORDERID': 123},
# {'LEAVESQTY': 100, 'ORDERID': 456},
# {'LEAVESQTY': 300, 'ORDERID': 789}]
EDIT
The sorted is necessary as @jpp suggested for this to work as it will otherwise produce duplicate keys if the data looks like this:
#trades = [{'ORDERID': 789, 'LEAVESQTY': 400},
# {'ORDERID': 123, 'LEAVESQTY': 2200},
# {'ORDERID': 123, 'LEAVESQTY': 500},
# {'ORDERID': 456, 'LEAVESQTY': 100},
# {'ORDERID': 789, 'LEAVESQTY': 300},
# {'ORDERID': 789, 'LEAVESQTY': 150}]
Upvotes: 1
Reputation: 164773
You can use collections.defaultdict
to construct a dictionary to list mapping.
Then use a dictionary comprehension to calculate the minimum of each list.
from collections import defaultdict
trades = [{'ORDERID': 123, 'LEAVESQTY': 2200},
{'ORDERID': 123, 'LEAVESQTY': 500},
{'ORDERID': 456, 'LEAVESQTY': 100},
{'ORDERID': 789, 'LEAVESQTY': 300}]
d = defaultdict(list)
for item in trades:
d[item['ORDERID']].append(item['LEAVESQTY'])
res = [{'ORDERID': k, 'LEAVESQTY': min(v)} for k, v in d.items()]
[{'LEAVESQTY': 500, 'ORDERID': 123},
{'LEAVESQTY': 100, 'ORDERID': 456},
{'LEAVESQTY': 300, 'ORDERID': 789}]
Upvotes: 1
Reputation: 82785
Using a simple iteration.
Demo:
d = {}
trades = [{'ORDERID': 123, 'LEAVESQTY': 2200},
{'ORDERID': 123, 'LEAVESQTY': 500},
{'ORDERID': 456, 'LEAVESQTY': 100},
{'ORDERID': 789, 'LEAVESQTY': 300}]
for i in trades:
if i['ORDERID'] not in d:
d[i["ORDERID"]] = i
else:
if d[i["ORDERID"]]["LEAVESQTY"] > i["LEAVESQTY"]:
d[i["ORDERID"]]["LEAVESQTY"] = i["LEAVESQTY"]
print(d.values())
Output:
[{'ORDERID': 456, 'LEAVESQTY': 100}, {'ORDERID': 123, 'LEAVESQTY': 500}, {'ORDERID': 789, 'LEAVESQTY': 300}]
Upvotes: 1
Reputation: 164773
Here's one way using sorted
and toolz.unique
. The idea is to sort by LEAVESQTY
and then drop duplicates by ORDERID
.
If you don't have access to the toolz
library, the logic is identical to the unique_everseen
recipe found in the itertools
docs.
from operator import itemgetter
from toolz import unique
trades = [{'ORDERID': 123, 'LEAVESQTY': 2200},
{'ORDERID': 123, 'LEAVESQTY': 500},
{'ORDERID': 456, 'LEAVESQTY': 100},
{'ORDERID': 789, 'LEAVESQTY': 300}]
sorter = sorted(trades, key=itemgetter('LEAVESQTY'))
res = list(unique(sorter, key=itemgetter('ORDERID')))
print(res)
[{'LEAVESQTY': 100, 'ORDERID': 456},
{'LEAVESQTY': 300, 'ORDERID': 789},
{'LEAVESQTY': 500, 'ORDERID': 123}]
Upvotes: 2