Reputation: 33
I am new to Python and have run into an issue with the code below.
I was looking for a way to group by multiple keys and summarize/average the values of a list of dictionaries in Python. The code below (taken from a previous question and answer: Group by multiple keys and summarize/average values of a list of dictionaries) set me on the right track, but I am running into issues when aggregating more than one field in the loop.
Say I have a list of dictionaries as seen below:
dict_list = [
{'msn': '001', 'source': 'foo', 'status': '1', 'qty': 100, 'vol': 100},
{'msn': '001', 'source': 'bar', 'status': '2', 'qty': 200, 'vol': 200},
{'msn': '001', 'source': 'foo', 'status': '1', 'qty': 300, 'vol': 300},
{'msn': '002', 'source': 'baz', 'status': '2', 'qty': 400, 'vol': 100},
{'msn': '002', 'source': 'baz', 'status': '1', 'qty': 500, 'vol': 400},
{'msn': '002', 'source': 'qux', 'status': '1', 'qty': 600, 'vol': 100},
{'msn': '003', 'source': 'foo', 'status': '2', 'qty': 700, 'vol': 200}]
My code so far:
for key, grp in groupby(sorted(dict_list, key=grouper), grouper):
    temp_dict = dict(zip(["msn", "source"], key))
    temp_dict["qty"] = sum(item["qty"] for item in grp)
    temp_dict["vol"] = sum(item["vol"] for item in grp)
    result.append(temp_dict)
Expected result was:
[{'msn': '001', 'source': 'foo', 'qty': 400, 'vol': 400},
{'msn': '001', 'source': 'bar', 'qty': 200, 'vol': 200},
{'msn': '002', 'source': 'baz', 'qty': 900, 'vol': 500},
{'msn': '002', 'source': 'qux', 'qty': 600, 'vol': 100},
{'msn': '003', 'source': 'foo', 'qty': 700, 'vol': 200}]
Adding temp_dict["vol"] = sum(item["vol"] for item in grp) inside the for loop does not produce the desired result, which is ultimately my issue.
How do I keep the key grouping shown in the code while adding another field and its calculated value to each result dictionary?
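The symptom can be reproduced in isolation. The sketch below assumes `grouper` is `operator.itemgetter("msn", "source")` (its definition is not shown in this post) and uses a shortened, hypothetical two-row list named `rows`. The second `sum` runs over an iterator that the first `sum` has already consumed, so "vol" comes out as 0.

```python
from itertools import groupby
from operator import itemgetter

# Assumption: grouper is not shown in the post; itemgetter is one plausible definition.
grouper = itemgetter("msn", "source")

# Hypothetical, shortened input: a single (msn, source) group.
rows = [
    {"msn": "001", "source": "foo", "qty": 100, "vol": 100},
    {"msn": "001", "source": "foo", "qty": 300, "vol": 300},
]

broken = []
for key, grp in groupby(sorted(rows, key=grouper), grouper):
    temp_dict = dict(zip(["msn", "source"], key))
    temp_dict["qty"] = sum(item["qty"] for item in grp)  # consumes grp
    temp_dict["vol"] = sum(item["vol"] for item in grp)  # grp is already empty here
    broken.append(temp_dict)

print(broken)  # [{'msn': '001', 'source': 'foo', 'qty': 400, 'vol': 0}]
```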
Thanks in advance for any help.
Upvotes: 3
Views: 996
Reputation: 53029
grp is an iterator, so the first sum consumes it and the second sum sees nothing. You need to "copy" grp if you want to iterate through it multiple times; itertools.tee can do that for you:
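from itertools import groupby, tee
from operator import itemgetter

grouper = itemgetter("msn", "source")  # assumed; the question does not show its definition
result = []
for key, grp in groupby(sorted(dict_list, key=grouper), grouper):
    temp_dict = dict(zip(["msn", "source"], key))
    grp1, grp2 = tee(grp)  # two independent iterators over the same group
    temp_dict["qty"] = sum(item["qty"] for item in grp1)
    temp_dict["vol"] = sum(item["vol"] for item in grp2)
    result.append(temp_dict)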
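As an alternative to tee, each group can be materialized into a list once and then summed as many times as needed. This is only a sketch: `grouper` is again assumed to be `itemgetter("msn", "source")`, and `fields` is a hypothetical name for the columns to total, not something from the question.

```python
from itertools import groupby
from operator import itemgetter

dict_list = [
    {'msn': '001', 'source': 'foo', 'status': '1', 'qty': 100, 'vol': 100},
    {'msn': '001', 'source': 'bar', 'status': '2', 'qty': 200, 'vol': 200},
    {'msn': '001', 'source': 'foo', 'status': '1', 'qty': 300, 'vol': 300},
    {'msn': '002', 'source': 'baz', 'status': '2', 'qty': 400, 'vol': 100},
    {'msn': '002', 'source': 'baz', 'status': '1', 'qty': 500, 'vol': 400},
    {'msn': '002', 'source': 'qux', 'status': '1', 'qty': 600, 'vol': 100},
    {'msn': '003', 'source': 'foo', 'status': '2', 'qty': 700, 'vol': 200},
]

# Assumption: grouper is not defined in the question; itemgetter is one plausible choice.
grouper = itemgetter("msn", "source")
fields = ["qty", "vol"]  # hypothetical list of fields to total per group

result = []
for key, grp in groupby(sorted(dict_list, key=grouper), grouper):
    items = list(grp)  # materialize once so the group can be traversed repeatedly
    temp_dict = dict(zip(["msn", "source"], key))
    for field in fields:
        temp_dict[field] = sum(item[field] for item in items)
    result.append(temp_dict)

for row in result:
    print(row)
```

Because the first sum exhausts its iterator before the second one starts, tee has to buffer the whole group anyway; the itertools documentation notes that list() is generally faster than tee() in that situation. Adding another aggregated field then only means adding its name to `fields`.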
Upvotes: 1