Reputation: 433
I receive data from an internal interface as a list of dicts, where several consecutive dicts would form one data record if they were combined.
Data looks similar to this:
# received data contains 'duplicate' dict-keys
DATA = [
    {"ID": 1234},
    {"PRICE": 77.33},
    {"DATE": "20201222"},
    {"ID": 4567},
    {"PRICE": 100.99},
    {"DATE": "20201222"}
]
In the above example, a "complete" record would contain the dicts ID, PRICE and DATE.
Unfortunately, the dict keys exist multiple times, so when I try something like this:
result = {}
for row in DATA:
    for idx, val in row.items():
        result[idx] = val
print(result)
# {
#     'ID': 4567,
#     'PRICE': 100.99,
#     'DATE': '20201222'
# }
The dict-keys (obviously) overwrite themselves.
I can't find a solution on how to combine the data into this desired structure:
DESIRED = [
    {
        "ID": 1234,
        "PRICE": 77.33,
        "DATE": "20201222"
    },
    {
        "ID": 4567,
        "PRICE": 100.99,
        "DATE": "20201222"
    }
]
Any hints for this? I'm even unsure on how to search for a solution to be honest.
Upvotes: 1
Views: 64
Reputation: 71451
You can use a nested dictionary comprehension:
data = [{'ID': 1234}, {'PRICE': 77.33}, {'DATE': '20201222'}, {'ID': 4567}, {'PRICE': 100.99}, {'DATE': '20201222'}]
r = [{a:b for j in data[i:i+3] for a, b in j.items()} for i in range(0, len(data), 3)]
Output:
[{'ID': 1234, 'PRICE': 77.33, 'DATE': '20201222'}, {'ID': 4567, 'PRICE': 100.99, 'DATE': '20201222'}]
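The same slicing idea can be wrapped in a small function so the group size is not hard-coded. This is only a sketch: the name `merge_chunks` and the length check are illustrative additions, and it still assumes every record consists of exactly `size` consecutive dicts.

```python
def merge_chunks(rows, size):
    """Merge every `size` consecutive dicts into one record (hypothetical helper)."""
    # Assumption: the input length is an exact multiple of the record size.
    if len(rows) % size:
        raise ValueError("number of rows is not a multiple of the chunk size")
    return [
        {key: value for row in rows[start:start + size] for key, value in row.items()}
        for start in range(0, len(rows), size)
    ]


data = [{"ID": 1234}, {"PRICE": 77.33}, {"DATE": "20201222"},
        {"ID": 4567}, {"PRICE": 100.99}, {"DATE": "20201222"}]
records = merge_chunks(data, 3)
print(records)
```

Raising on a leftover partial group is a design choice; silently returning a shorter last record may be preferable if incomplete records are expected.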
Upvotes: 0
Reputation: 325
If you are looking for an approach that is flexible and can handle any amount of data with any key names (as long as every key occurs the same number of times), here it is:
items = {}
DATA = [
    {"ID": 1234},
    {"PRICE": 77.33},
    {"DATE": "20201222"},
    {"ID": 4567},
    {"PRICE": 100.99},
    {"DATE": "20201222"}
]
# collect all values per key: {"ID": [1234, 4567], "PRICE": [...], ...}
for i in DATA:
    key = list(i.keys())[0]
    val = i[key]
    if key in items:
        items[key].append(val)
    else:
        items[key] = [val]
output = []
keys = list(items.keys())
values = list(items.values())
# build the i-th record from the i-th value of every key
for i in range(len(values[0])):
    curData = {}
    for k in keys:
        curData[k] = items[k][i]
    output.append(curData)
for i in output:
    print(i)
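If records can have a varying number of fields, one possible variation is to start a new record whenever a key shows up that the current record already contains. This is a sketch under that assumption (each record begins with a key that the previous record also had, such as ID); the function name `split_on_repeat` is made up for illustration.

```python
def split_on_repeat(rows):
    """Group single-key dicts into records, starting a new one when a key repeats."""
    records = []
    current = {}
    for row in rows:
        # A key that is already present means the previous record is complete.
        if any(key in current for key in row):
            records.append(current)
            current = {}
        current.update(row)
    if current:
        records.append(current)  # do not lose the final record
    return records


DATA = [
    {"ID": 1234}, {"PRICE": 77.33}, {"DATE": "20201222"},
    {"ID": 4567}, {"PRICE": 100.99}, {"DATE": "20201222"},
]
result = split_on_repeat(DATA)
print(result)
```

Unlike the fixed step of 3, this does not need to know the record size up front, but it cannot detect a record whose first field is missing.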
Upvotes: 0
Reputation: 191
Dictionaries do not support duplicate keys. An alternative is to map each key to a list of its values. This can be done in two different ways:
Using setdefault method: please refer to this link Make a dictionary with duplicate keys in Python!
results = {}
for row in DATA:
    for k, v in row.items():
        # create the list on first sight of a key, then append to it
        results.setdefault(k, []).append(v)
print(results)
Using defaultdict method: please refer to this link Make a dictionary with duplicate keys in Python!
from collections import defaultdict
default_dict = defaultdict(list)
for row in DATA:
    for k, v in row.items():
        # missing keys start out as an empty list automatically
        default_dict[k].append(v)
print(default_dict)
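Grouping by key gives one list per key rather than the list of records asked for in the question. One way to get from there to the desired structure is to zip the lists back together, sketched below. It assumes every key occurs equally often and in record order; `zip` silently stops at the shortest list, so missing values would shift or drop records.

```python
from collections import defaultdict

DATA = [
    {"ID": 1234}, {"PRICE": 77.33}, {"DATE": "20201222"},
    {"ID": 4567}, {"PRICE": 100.99}, {"DATE": "20201222"},
]

# Step 1: collect all values per key.
grouped = defaultdict(list)
for row in DATA:
    for key, value in row.items():
        grouped[key].append(value)

# Step 2: take the n-th value of every key and pair it with its key name.
records = [dict(zip(grouped, values)) for values in zip(*grouped.values())]
print(records)
```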
Upvotes: 0
Reputation: 338
There might be a better way to do it, but a simple loop with a step of 3 is sufficient. As long as the input data is formatted as you showed, it will work. For example:
DATA = [
    {"ID": 1234},
    {"PRICE": 77.33},
    {"DATE": "20201222"},
    {"ID": 4567},
    {"PRICE": 100.99},
    {"DATE": "20201222"}
]
DESIRED = []
for i in range(0, len(DATA), 3):
    DESIRED.append(dict(DATA[i]))   # ID (copied so DATA itself is not modified)
    DESIRED[-1].update(DATA[i+1])   # PRICE
    DESIRED[-1].update(DATA[i+2])   # DATE
print(DESIRED)
Upvotes: 0
Reputation: 61910
If the values are always contiguous (and of size 3), you could use zip to iterate in triplets:
DATA = [
    {"ID": 1234},
    {"PRICE": 77.33},
    {"DATE": "20201222"},
    {"ID": 4567},
    {"PRICE": 100.99},
    {"DATE": "20201222"}
]
res = [{**i, **price, **date } for i, price, date in zip(DATA[::3], DATA[1::3], DATA[2::3])]
print(res)
Output
[{'DATE': '20201222', 'ID': 1234, 'PRICE': 77.33},
{'DATE': '20201222', 'ID': 4567, 'PRICE': 100.99}]
An alternative solution is to use the following for loop:
res = []
for i, price, date in zip(DATA[::3], DATA[1::3], DATA[2::3]):
    res.append({"ID": i["ID"], "PRICE": price["PRICE"], "DATE": date["DATE"]})
Upvotes: 2
Reputation: 10799
If your DATA dictionaries are guaranteed to appear in the order you've shown, and they always appear in groups of three, you can grab three dictionaries at a time and merge them:
from itertools import islice
data = iter([
    {"ID": 1234},
    {"PRICE": 77.33},
    {"DATE": "20201222"},
    {"ID": 4567},
    {"PRICE": 100.99},
    {"DATE": "20201222"}
])
# the walrus operator (:=) requires Python 3.8+
while chunk := list(islice(data, 3)):
    id_dict, price_dict, date_dict = chunk
    merged = {**id_dict, **price_dict, **date_dict}
    print(merged)
Output:
{'ID': 1234, 'PRICE': 77.33, 'DATE': '20201222'}
{'ID': 4567, 'PRICE': 100.99, 'DATE': '20201222'}
Upvotes: 0