Jeff Widman

Reputation: 23482

Why is this dictionary overwriting itself during for loop?

I have a bit of code that is trying to transform a dictionary from one nesting format to another using a series of for loops so that I can easily export the dictionary to a CSV file. However, as my script loops through the input dict, it overwrites the output dict rather than appending the additional values, and I can't figure out why.

Here's the format of the input dictionary:

{'data': [{'title': 'Lifetime Likes by Country',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 344025, 'US': 886485}}]},
          {'title': 'Daily Country: People Talking About This',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 289, 'US': 829}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 262, 'US': 836}}]}]}

Here's my code:

input_dict = function_to_get_input_dict()
filtered_dict = {}
for metric in input_dict['data']:
    for day in metric['values']:
        parsed_date = parser.parse(day['end_time'])
        date_key = parsed_date.strftime('%m/%d/%Y')
        filtered_dict[date_key] = {}
        filtered_dict[date_key]['Total %s' % metric['title']] = 0
        for k, v in day['value'].iteritems():
            filtered_dict[date_key]['%s : %s' % (metric['title'], k)] = v
            filtered_dict[date_key]['Total %s' % metric['title']] += v
pprint(filtered_dict) #debug

Expected output dictionary format:

{date1:{metric_1_each_country_code:value, metric_1_all_country_total:value, metric_2_each_country_code:value, metric_2_all_country_total:value}, date2:{etc}}

However, instead I'm getting an output dictionary that only has one metric per date: {date1:{metric_2_each_country_code:value, metric_2_all_country_total:value}, date2:{etc}}

It appears to be overwriting the metric key:value pairs each time, which I don't understand, because the keys should be unique to each metric using the ['%s : %s' % (metric['title'], k)] formula, so they shouldn't get overwritten.

What am I missing?

Upvotes: 1

Views: 1081

Answers (2)

Aaron Hall

Reputation: 394915

I think one problem is that your data has syntax errors in it, which makes its structure nearly impossible to see. I have corrected it and pretty-printed the whole thing so the structure is easier to read. This is not a complete answer, but it goes a long way toward solving the problem:

import pprint; pprint.pprint({"data": [{ "values": [{ "value": { "US": 886367, "IN": 343818, "PK": 212632}, "end_time": "2013-11-10T08:00:00+0000"},{"value": { "US": 886485, "IN": 344025}, "end_time": "2013-11-11T08:00:00+0000"}], "title": "Lifetime Likes by Country"}, {"values": [{"value": { "US": 829, "IN": 289}, "end_time": "2013-11-10T08:00:00+0000"},{"value": {"US": 836,"IN": 262}, "end_time": "2013-11-11T08:00:00+0000"}], "title": "Daily Country: People Talking About This"}]})
{'data': [{'title': 'Lifetime Likes by Country',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 344025, 'US': 886485}}]},
          {'title': 'Daily Country: People Talking About This',
           'values': [{'end_time': '2013-11-10T08:00:00+0000',
                       'value': {'IN': 289, 'US': 829}},
                      {'end_time': '2013-11-11T08:00:00+0000',
                       'value': {'IN': 262, 'US': 836}}]}]}

Now that I can see the nature of your data, perhaps this type of data structure would better suit your needs:

import pprint; pprint.pprint({'Daily Country: People Talking About This': {'2013-11-11T08:00:00+0000': {'US': 836, 'IN': 262}, '2013-11-10T08:00:00+0000': {'US': 829, 'IN': 289}}, 'Lifetime Likes by Country': {'2013-11-11T08:00:00+0000': {'US': 886485, 'IN': 344025}, '2013-11-10T08:00:00+0000': {'PK': 212632, 'US': 886367, 'IN': 343818}}})

Which gives you:

{'Daily Country: People Talking About This': {'2013-11-10T08:00:00+0000': {'IN': 289,
                                                                           'US': 829},
                                              '2013-11-11T08:00:00+0000': {'IN': 262,
                                                                           'US': 836}},
 'Lifetime Likes by Country': {'2013-11-10T08:00:00+0000': {'IN': 343818,
                                                            'PK': 212632,
                                                            'US': 886367},
                               '2013-11-11T08:00:00+0000': {'IN': 344025,
                                                            'US': 886485}}}
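
If that shape is useful, here is a minimal sketch of how it could be built from the original input rather than typed out by hand. The name by_metric is just a placeholder I made up, and the sample data is copied from the question so the snippet runs on its own; I am assuming Python 3 here.

```python
from pprint import pprint

# Sample input copied from the question.
input_dict = {'data': [
    {'title': 'Lifetime Likes by Country',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 344025, 'US': 886485}}]},
    {'title': 'Daily Country: People Talking About This',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 289, 'US': 829}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 262, 'US': 836}}]}]}

# by_metric is a hypothetical name: metric title -> end_time -> country counts.
by_metric = {}
for metric in input_dict['data']:
    by_metric[metric['title']] = {
        # dict(...) copies the counts so the input is not shared with the output.
        day['end_time']: dict(day['value'])
        for day in metric['values']
    }

pprint(by_metric)
```

Each metric title appears exactly once as a top-level key, so nothing from one metric can clobber another.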

Upvotes: 0

adityajones

Reputation: 601

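Notice that in the second for loop of your code you have filtered_dict[date_key] = {}. That line runs once for every metric, so when the second metric reaches a date that the first metric already filled in, it replaces that date's dict with an empty one instead of allowing you to add to it.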

input_dict = function_to_get_input_dict()
filtered_dict = {}
for metric in input_dict['data']:
    for day in metric['values']:
        parsed_date = parser.parse(day['end_time'])
        date_key = parsed_date.strftime('%m/%d/%Y')
        filtered_dict[date_key] = {}  # resets this date's dict on every metric pass
        filtered_dict[date_key]['Total %s' % metric['title']] = 0
        for k, v in day['value'].iteritems():
            filtered_dict[date_key]['%s : %s' % (metric['title'], k)] = v
            filtered_dict[date_key]['Total %s' % metric['title']] += v
pprint(filtered_dict) #debug
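
One possible way to avoid the reset is to create the per-date dict only when the date has not been seen yet, for example with dict.setdefault. The sketch below is written for Python 3 (so .items() instead of .iteritems()), and to keep it self-contained I have swapped dateutil's parser.parse for datetime.strptime with a format string that I am assuming matches your end_time values. The input is the sample from your question.

```python
from datetime import datetime
from pprint import pprint

# Sample input copied from the question.
input_dict = {'data': [
    {'title': 'Lifetime Likes by Country',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 343818, 'PK': 212632, 'US': 886367}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 344025, 'US': 886485}}]},
    {'title': 'Daily Country: People Talking About This',
     'values': [{'end_time': '2013-11-10T08:00:00+0000',
                 'value': {'IN': 289, 'US': 829}},
                {'end_time': '2013-11-11T08:00:00+0000',
                 'value': {'IN': 262, 'US': 836}}]}]}

filtered_dict = {}
for metric in input_dict['data']:
    total_key = 'Total %s' % metric['title']
    for day in metric['values']:
        # Assumption: every end_time looks like 2013-11-10T08:00:00+0000.
        parsed_date = datetime.strptime(day['end_time'], '%Y-%m-%dT%H:%M:%S%z')
        date_key = parsed_date.strftime('%m/%d/%Y')
        # setdefault returns the existing dict for this date, or stores and
        # returns a new empty one, so earlier metrics are never wiped out.
        day_dict = filtered_dict.setdefault(date_key, {})
        day_dict[total_key] = 0
        for k, v in day['value'].items():
            day_dict['%s : %s' % (metric['title'], k)] = v
            day_dict[total_key] += v

pprint(filtered_dict)
```

With this change each date keeps the keys from both metrics. A collections.defaultdict(dict) would work just as well if you prefer that style.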

Upvotes: 1
