Reputation: 23482
I'm pretty sure this is a n00b question, but I can't seem to figure it out. Any help appreciated.
I've got an application that generates a series of files and within each file is a dictionary formatted as:
{date1:{key1:result1, key2:result2},date2:{key2:result3}}
I want to figure out the daily average for each value. So I'd like to create one dictionary per unique key that aggregates results from across all the files:
unique_key_dict = {date1:[file1_result, file2_result],date2:[file1_result, file2_result]}
I won't know in advance the names of the keys or how many unique keys there will be, although there won't be more than 25 unique keys across my entire dataset, and for speed reasons, I only want to open each file once.
How do I write the following in Python?
for date in file_dict:
    for key in file_dict[date]:
        # if key_dict does not exist from a previous file or date, create it
        # once the dictionary exists, append this value to the list tied to the date key.
I just can't seem to figure out how to dynamically create a dictionary using the name of the key. If I were dynamically printing their names I'd do "dict_for_%s" % key, but I'm not trying to print, I'm trying to create dictionaries.
Also, I could just create a single massive dict... which is faster? A single massive dict or 15-25 separate dictionaries?
Upvotes: 0
Views: 659
Reputation: 3867
This does part of it:
unique_key_dict = {}
for date in file_dict:
    for key in file_dict[date]:
        if date not in unique_key_dict: unique_key_dict[date] = []
        unique_key_dict[date].append(file_dict[date][key])
Or perhaps you want
unique_key_dict = {}
for date in file_dict:
    for key in file_dict[date]:
        if key not in unique_key_dict: unique_key_dict[key] = {}
        if date not in unique_key_dict[key]: unique_key_dict[key][date] = []
        unique_key_dict[key][date].append(file_dict[date][key])
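Then you have a dict which maps each key to a dict, and these inner dicts map dates to lists of values. Create unique_key_dict once, before looping over the files, so that results from every file accumulate in the same structure.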
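The same nested structure can also be built with collections.defaultdict, which removes the explicit "does it exist yet" checks. This is only a sketch: the function name aggregate, the variable file_dicts, and the sample data are made up for illustration, and it assumes each file has already been parsed into a dict shaped like the one in the question. One nested dict keyed by name also avoids having to create dynamically named variables at all.

```python
from collections import defaultdict

def aggregate(file_dicts):
    """Map each key to {date: [values collected from every file]}."""
    # A missing key gets a fresh inner defaultdict; a missing date gets a fresh list.
    unique_key_dict = defaultdict(lambda: defaultdict(list))
    for file_dict in file_dicts:
        for date, results in file_dict.items():
            for key, value in results.items():
                unique_key_dict[key][date].append(value)
    return unique_key_dict

# Hypothetical sample data standing in for two parsed files.
file1 = {"2011-01-01": {"a": 1, "b": 2}, "2011-01-02": {"b": 3}}
file2 = {"2011-01-01": {"a": 5}, "2011-01-02": {"b": 7}}

by_key = aggregate([file1, file2])
print(dict(by_key["a"]))
```

Each file's dict is visited exactly once, so this fits the "open each file once" requirement as long as the files are parsed one at a time and passed in.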
To get averages after that:
for key in unique_key_dict:
    for date in unique_key_dict[key]:
        values = unique_key_dict[key][date]
        # float() guards against integer division and string values
        avg = sum(float(x) for x in values) / len(values)
        print(key, date, avg)
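If the averages are needed later rather than just printed, they can be collected into a dict with the same key/date shape. A small sketch, where daily_averages is a made-up helper name and the sample data is invented for the example:

```python
def daily_averages(unique_key_dict):
    """Return {key: {date: mean of the values collected for that date}}."""
    return {
        key: {
            date: sum(float(x) for x in values) / len(values)
            for date, values in by_date.items()
        }
        for key, by_date in unique_key_dict.items()
    }

# Hypothetical aggregated data: key -> date -> list of per-file results.
sample = {
    "a": {"2011-01-01": [1, 5]},
    "b": {"2011-01-01": [2], "2011-01-02": [3, 7]},
}

averages = daily_averages(sample)
print(averages["b"]["2011-01-02"])
```

This assumes every date list is non-empty, which holds when the lists are only ever created at the moment a value is appended.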
Upvotes: 2