Reputation: 849
I have a list like this:
data = [
{'date':'2017-01-02', 'model': 'iphone5', 'feature':'feature1'},
{'date':'2017-01-02', 'model': 'iphone7', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone6', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone6', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone7', 'feature':'feature3'},
{'date':'2017-01-10', 'model': 'iphone7', 'feature':'feature2'},
{'date':'2017-01-10', 'model': 'iphone7', 'feature':'feature1'},
]
I want to achieve this:
[
{
'2017-01-02':[{'iphone5':['feature1']}, {'iphone7':['feature2']}]
},
{
'2017-01-03': [{'iphone6':['feature2']}, {'iphone7':['feature3']}]
},
{
'2017-01-10':[{'iphone7':['feature2', 'feature1']}]
}
]
I need an efficient approach, since there could be a lot of data.
I was trying this:
import itertools
from operator import itemgetter

data = sorted(data, key=itemgetter('date'))
date = itertools.groupby(data, key=itemgetter('date'))
But I'm getting nothing for the value of the 'date' key.
Later I will iterate over this structure to build an HTML page.
Upvotes: 2
Views: 365
Reputation: 18625
You can do this pretty efficiently and cleanly using defaultdict. Unfortunately it's a fairly advanced use and it gets hard to read.
from collections import defaultdict
from pprint import pprint
# create a dictionary whose elements are automatically dictionaries of sets
result_dict = defaultdict(lambda: defaultdict(set))
# Construct a dictionary with one key for each date and another dict ('model_dict')
# as the value.
# The model_dict has one key for each model and a set of features as the value.
for d in data:
    result_dict[d["date"]][d["model"]].add(d["feature"])
# more explicit version:
# for d in data:
#     model_dict = result_dict[d["date"]]  # created automatically if needed
#     feature_set = model_dict[d["model"]]  # created automatically if needed
#     feature_set.add(d["feature"])
# convert the result_dict into the required form
result_list = [
    {
        date: [
            {phone: list(feature_set)}
            for phone, feature_set in sorted(model_dict.items())
        ]
    } for date, model_dict in sorted(result_dict.items())
]
pprint(result_list)
# [{'2017-01-02': [{'iphone5': ['feature1']}, {'iphone7': ['feature2']}]},
# {'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
# {'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}]
# (the order of features inside each list may differ, because sets are unordered)
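If the order in which features first appear matters, one possible variant (a sketch, not part of the code above) replaces the innermost set with a plain dict used as an insertion-ordered set. This assumes Python 3.7+, where dicts keep insertion order; the names ordered and ordered_result are made up for this illustration.

```python
from collections import defaultdict

data = [
    {'date': '2017-01-02', 'model': 'iphone5', 'feature': 'feature1'},
    {'date': '2017-01-02', 'model': 'iphone7', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone6', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone6', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone7', 'feature': 'feature3'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature2'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature1'},
]

# date -> model -> {feature: None}
# The innermost dict acts as an ordered set: duplicate keys collapse,
# and keys come back in first-seen order (assumes Python 3.7+).
ordered = defaultdict(lambda: defaultdict(dict))
for d in data:
    ordered[d["date"]][d["model"]][d["feature"]] = None

# Convert to the nested list-of-dicts shape requested in the question.
ordered_result = [
    {date: [{model: list(features)} for model, features in sorted(models.items())]}
    for date, models in sorted(ordered.items())
]
```

With this variant the feature lists are deterministic, so the last entry is always ['feature2', 'feature1'] for the sample data.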
Upvotes: 3
Reputation: 11477
You can try this approach. td is a dict that maps each model to its index in tl ({model: index}), which is used to check whether a model already exists in the list of dicts:
from itertools import groupby
from operator import itemgetter
r = []
for i in groupby(sorted(data, key=itemgetter('date')), key=itemgetter('date')):
    td, tl = {}, []
    for j in i[1]:
        if j["model"] not in td:
            tl.append({j["model"]: [j["feature"]]})
            td[j["model"]] = len(tl) - 1
        elif j["feature"] not in tl[td[j["model"]]][j["model"]]:
            tl[td[j["model"]]][j["model"]].append(j["feature"])
    r.append({i[0]: tl})
Result:
[
{'2017-01-02': [{'iphone5': ['feature1']}, {'iphone7': ['feature2']}]},
{'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
{'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}
]
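A note on why groupby can appear to return nothing: it yields (key, group) pairs, and each group is a lazy iterator that shares the underlying iterable with groupby itself. Once groupby advances to the next key, the previous group is no longer visible, so each group has to be consumed (or copied into a list) inside the loop. A minimal sketch, using a small made-up rows list rather than the full data:

```python
from itertools import groupby
from operator import itemgetter

# Small illustrative sample (hypothetical, shortened from the question's data).
rows = [
    {'date': '2017-01-03', 'model': 'iphone6'},
    {'date': '2017-01-02', 'model': 'iphone5'},
    {'date': '2017-01-02', 'model': 'iphone7'},
]

# groupby only merges adjacent items, so sort by the same key first.
rows_sorted = sorted(rows, key=itemgetter('date'))

# Materialize each group while iterating, before groupby moves on.
grouped = {
    date: [row['model'] for row in group]
    for date, group in groupby(rows_sorted, key=itemgetter('date'))
}
```

Here grouped maps each date to the list of models seen on that date.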
As a matter of fact, I think the data structure can be simplified; you may not need so many levels of nesting.
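For example, a flatter shape could be a dict keyed by date whose values are dicts mapping each model to its list of features. This is only a sketch of one possible simplification; the name simple and the shortened sample data are assumptions for illustration.

```python
from collections import defaultdict

# Shortened sample in the same format as the question's data.
data = [
    {'date': '2017-01-02', 'model': 'iphone5', 'feature': 'feature1'},
    {'date': '2017-01-03', 'model': 'iphone6', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone6', 'feature': 'feature2'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature2'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature1'},
]

# date -> model -> [features], with duplicates skipped and order preserved.
simple = defaultdict(dict)
for d in data:
    features = simple[d["date"]].setdefault(d["model"], [])
    if d["feature"] not in features:
        features.append(d["feature"])
simple = dict(simple)  # plain dict, so missing dates raise KeyError later

# Iterating for output (e.g. when building HTML) needs only two loops.
for date in sorted(simple):
    for model, features in simple[date].items():
        print(date, model, ", ".join(features))
```

No sorting of the input is needed here, and looking up a single date or model is a direct dict access instead of a scan through a list of one-key dicts.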
Upvotes: 1
Reputation: 512
total_result = list()
result = dict()
inner_value = dict()
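# Assumes data is already sorted by date (records for one date are adjacent).
for d in data:
    if d["date"] not in result:
        # A new date starts: store the finished group and reset the state.
        if result:
            total_result.append(result)
        result = dict()
        result[d["date"]] = set()
        inner_value = dict()
    if d["model"] not in inner_value:
        inner_value[d["model"]] = set()
    inner_value[d["model"]].add(d["feature"])
    tmp_v = [{key: list(inner_value[key])} for key in inner_value]
    result[d["date"]] = tmp_v
# Store the last group, which the loop never appends.
total_result.append(result)
total_result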
[{'2017-01-02': [{'iphone7': ['feature2']}, {'iphone5': ['feature1']}]},
{'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
{'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}]
Upvotes: 0