Reputation: 23482
I have a bit of code that is trying to transform a dictionary from one nesting format to another using a series of for loops so that I can easily export the dictionary to a CSV file. However, as my script loops through the input dict, it overwrites the output dict rather than appending the additional values, and I can't figure out why.
Here's the format of the input dictionary:
{'data': [{'title': 'Lifetime Likes by Country',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 344025, 'US': 886485}}]},
          {'title': 'Daily Country: People Talking About This',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 289, 'US': 829}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 262, 'US': 836}}]}]}
Here's my code:
input_dict = function_to_get_input_dict()
filtered_dict = {}
for metric in input_dict['data']:
    for day in metric['values']:
        parsed_date = parser.parse(day['end_time'])
        date_key = parsed_date.strftime('%m/%d/%Y')
        filtered_dict[date_key] = {}
        filtered_dict[date_key]['Total %s' % metric['title']] = 0
        for k, v in day['value'].iteritems():
            filtered_dict[date_key]['%s : %s' % (metric['title'], k)] = v
            filtered_dict[date_key]['Total %s' % metric['title']] += v
pprint(filtered_dict) #debug
Expected output dictionary format:
{date1:{metric_1_each_country_code:value, metric_1_all_country_total:value, metric_2_each_country_code:value, metric_2_all_country_total:value}, date2:{etc}}
However, instead I'm getting an output dictionary that only has one metric per date:
{date1:{metric_2_each_country_code:value, metric_2_all_country_total:value}, date2:{etc}}
It appears to be overwriting the metric key:value pair each time, which I don't understand because the keys should be unique to each metric given the '%s : %s' % (metric['title'], k) formula, so they shouldn't get overwritten.
What am I missing?
Upvotes: 1
Views: 1081
Reputation: 394915
I think one problem is that your data has syntax errors in it and it is nearly impossible to see the structure. I have corrected it and pretty printed the whole thing to help you better see its structure. Not a complete answer, but it goes a long way towards helping solve the problem:
import pprint; pprint.pprint({"data": [{ "values": [{ "value": { "US": 886367, "IN": 343818, "PK": 212632}, "end_time": "2013-11-10T08:00:00+0000"},{"value": { "US": 886485, "IN": 344025}, "end_time": "2013-11-11T08:00:00+0000"}], "title": "Lifetime Likes by Country"}, {"values": [{"value": { "US": 829, "IN": 289}, "end_time": "2013-11-10T08:00:00+0000"},{"value": {"US": 836,"IN": 262}, "end_time": "2013-11-11T08:00:00+0000"}], "title": "Daily Country: People Talking About This"}]})
{'data': [{'title': 'Lifetime Likes by Country',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 344025, 'US': 886485}}]},
          {'title': 'Daily Country: People Talking About This',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 289, 'US': 829}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 262, 'US': 836}}]}]}
Now that I can see the nature of your data, perhaps this type of data structure would better suit your needs:
import pprint; pprint.pprint({'Daily Country: People Talking About This': {'2013-11-11T08:00:00+0000': {'US': 836, 'IN': 262}, '2013-11-10T08:00:00+0000': {'US': 829, 'IN': 289}}, 'Lifetime Likes by Country': {'2013-11-11T08:00:00+0000': {'US': 886485, 'IN': 344025}, '2013-11-10T08:00:00+0000': {'PK': 212632, 'US': 886367, 'IN': 343818}}})
Which gives you:
{'Daily Country: People Talking About This': {'2013-11-10T08:00:00+0000': {'IN': 289,
                                                                           'US': 829},
                                              '2013-11-11T08:00:00+0000': {'IN': 262,
                                                                           'US': 836}},
 'Lifetime Likes by Country': {'2013-11-10T08:00:00+0000': {'IN': 343818,
                                                            'PK': 212632,
                                                            'US': 886367},
                               '2013-11-11T08:00:00+0000': {'IN': 344025,
                                                            'US': 886485}}}
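One way to get from the original response to that layout is a pair of nested loops with setdefault. This is only a sketch: regroup_by_metric is a name made up for illustration, and input_dict is simply the sample data from the question pasted in.

```python
def regroup_by_metric(input_dict):
    """Return {title: {end_time: {country: value}}} from the API-style dict."""
    regrouped = {}
    for metric in input_dict['data']:
        # setdefault creates the per-metric dict once and reuses it afterwards
        by_date = regrouped.setdefault(metric['title'], {})
        for day in metric['values']:
            # copy the inner dict so later edits do not leak back into the input
            by_date[day['end_time']] = dict(day['value'])
    return regrouped


# Sample data copied from the question
input_dict = {'data': [
    {'title': 'Lifetime Likes by Country',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 344025, 'US': 886485}}]},
    {'title': 'Daily Country: People Talking About This',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 289, 'US': 829}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 262, 'US': 836}}]}]}

regrouped = regroup_by_metric(input_dict)
```

With the data keyed by metric first, each metric can be written out as its own table, with dates as rows and country codes as columns.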
Upvotes: 0
Reputation: 601
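If you look at your code, the second for loop contains filtered_dict[date_key] = {}. This resets the value of filtered_dict[date_key] on every pass, including when the second metric reaches a date that the first metric has already filled in, instead of allowing you to add to it. Create the inner dict only when the date key is not there yet: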
input_dict = function_to_get_input_dict()
filtered_dict = {}
for metric in input_dict['data']:
    for day in metric['values']:
        parsed_date = parser.parse(day['end_time'])
        date_key = parsed_date.strftime('%m/%d/%Y')
        # Create the per-date dict only once; later metrics reuse it
        filtered_dict.setdefault(date_key, {})
        filtered_dict[date_key]['Total %s' % metric['title']] = 0
        # .items() works on both Python 2 and Python 3
        for k, v in day['value'].items():
            filtered_dict[date_key]['%s : %s' % (metric['title'], k)] = v
            filtered_dict[date_key]['Total %s' % metric['title']] += v
pprint(filtered_dict) #debug
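For a version that can be run without the original data source, here is a rough, self-contained sketch for Python 3 that also covers the CSV step the question mentions. Assumptions: flatten_by_date is a name invented for this example, datetime.strptime stands in for parser.parse so that only the standard library is needed, the sample dict is the data from the question, and the CSV is written to an in-memory buffer rather than a real file.

```python
import csv
import io
from datetime import datetime


def flatten_by_date(input_dict):
    """Return {date: {column_name: value}} with all metrics merged per date."""
    filtered = {}
    for metric in input_dict['data']:
        title = metric['title']
        total_key = 'Total %s' % title
        for day in metric['values']:
            parsed = datetime.strptime(day['end_time'], '%Y-%m-%dT%H:%M:%S%z')
            date_key = parsed.strftime('%m/%d/%Y')
            # setdefault keeps what earlier metrics stored for this date
            row = filtered.setdefault(date_key, {})
            row[total_key] = 0
            for country, count in day['value'].items():
                row['%s : %s' % (title, country)] = count
                row[total_key] += count
    return filtered


# Sample data copied from the question
sample = {'data': [
    {'title': 'Lifetime Likes by Country',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 344025, 'US': 886485}}]},
    {'title': 'Daily Country: People Talking About This',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 289, 'US': 829}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 262, 'US': 836}}]}]}

rows = flatten_by_date(sample)

# Not every date has every country, so collect the union of all column names
columns = sorted({key for row in rows.values() for key in row})

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['date'] + columns, restval='')
writer.writeheader()
for date_key in sorted(rows):
    # restval='' leaves the cell empty when a date lacks a column
    writer.writerow(dict(rows[date_key], date=date_key))
csv_text = buffer.getvalue()
```

Each date ends up as one CSV row holding the columns of both metrics; a country missing on a given date (PK on 11/11/2013 here) is left blank rather than raising an error.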
Upvotes: 1