Reputation: 23
The output of my code gives the following:
[{'Total Population:': 4585, 'Total Water Ice Cover': 2.848142234497044, 'Total Developed': 17.205368316575324, 'Total Barren Land': 0.22439908514219134, 'Total Forest': 34.40642126612868},
{'Total Population:': 4751, 'Total Water Ice Cover': 1.047783534830167, 'Total Developed': 37.27115716753022, 'Total Barren Land': 0.11514104778353484, 'Total Forest': 19.11341393206678},
{'Total Population:': 3214, 'Total Water Ice Cover': 0.09166603009701321, 'Total Developed': 23.50469788404247, 'Total Barren Land': 0.2597204186082041, 'Total Forest': 20.418608204109695},
{'Total Population:': 5005, 'Total Water Ice Cover': 0.0, 'Total Developed': 66.37545713124746, 'Total Barren Land': 0.0, 'Total Forest': 10.68671271840715},
...
]
What I'd like to be able to do is get all the values for 'Total Population' and store them in one list, then get all the 'Total Water Ice Cover' values and store them in another list, and so on. With a data structure like this, how does one extract these values and store them in separate lists?
Thank you
Upvotes: 2
Views: 121
Reputation: 43524
If your goal is to calculate Pearson's correlation, you should use pandas for this.
Suppose your original list of dictionaries was stored in a variable called output. You can easily convert it into a pandas DataFrame using:
import pandas as pd
df = pd.DataFrame(output)
print(df)
# Total Barren Land Total Developed Total Forest Total Population: Total Water Ice Cover
#0 0.224399 17.205368 34.406421 4585 2.848142
#1 0.115141 37.271157 19.113414 4751 1.047784
#2 0.259720 23.504698 20.418608 3214 0.091666
#3 0.000000 66.375457 10.686713 5005 0.000000
Now you can easily generate a correlation matrix:
# this is just to make the output print nicer
pd.set_option("display.precision", 4) # only show 4 decimal places
# remove 'Total ' from column names to make printing smaller
df.rename(columns=lambda x: x.replace("Total ", ""), inplace=True)
corr = df.corr(method="pearson")
print(corr)
# Barren Land Developed Forest Population: Water Ice Cover
#Barren Land 1.0000 -0.9579 0.7361 -0.7772 0.4001
#Developed -0.9579 1.0000 -0.8693 0.5736 -0.6194
#Forest 0.7361 -0.8693 1.0000 -0.1575 0.9114
#Population: -0.7772 0.5736 -0.1575 1.0000 0.2612
#Water Ice Cover 0.4001 -0.6194 0.9114 0.2612 1.0000
Now you can access individual correlations by key:
print(corr.loc["Forest", "Water Ice Cover"])
#0.91135717479534217
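As a sanity check on the matrix above, a single coefficient can be recomputed by hand. This is a minimal sketch using only the standard library; the helper name pearson is made up for illustration, and the two lists are copied from the four sample rows in the question. The result should land close to the Forest / Water Ice Cover entry printed above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Values copied from the question's sample rows.
forest = [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]
water = [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0]

r = pearson(forest, water)
print(round(r, 4))
```

The normalisation by n or n - 1 cancels between numerator and denominator, so it is left out here.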
Upvotes: 2
Reputation: 60997
If all the dicts have the same keys, then you can just use the keys of the first dict:
result = {k:[d[k] for d in dictionary_list] for k in dictionary_list[0].keys()}
If the dicts could have different sets of keys, but you're OK with lists of different lengths, I would use a defaultdict to simplify:
from collections import defaultdict
result = defaultdict(list)
for d in dictionary_list:
    for k, v in d.items():
        result[k].append(v)
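To see what "lists of different lengths" means in practice, here is a small sketch with made-up records (the keys and values are illustrative, not taken from the question) in which the second record lacks one key:

```python
from collections import defaultdict

# Made-up sample data: the second record has no 'b' key.
dictionary_list = [{'a': 1, 'b': 2}, {'a': 3}, {'a': 5, 'b': 6}]

result = defaultdict(list)
for d in dictionary_list:
    for k, v in d.items():
        result[k].append(v)

print(dict(result))  # {'a': [1, 3, 5], 'b': [2, 6]}
```

Note that the position of a value in the 'b' list no longer tells you which record it came from, which matters if the lists are later compared element by element.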
If the dicts could have different sets of keys, and you want all the lists to be the same length, then you'll need to iterate twice. You'll also need some kind of placeholder value to use for when the key is missing. If we want to use None for that, we can do:
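placeholder = None
# first pass: collect every key that appears in any dict
keys = set()
for d in dictionary_list:
    keys |= set(d.keys())  # sets don't support +=, so use |= (or keys.update(d))
# second pass: build equal-length lists, filling gaps with the placeholder
result = {k: [] for k in keys}
for d in dictionary_list:
    for k in keys:
        result[k].append(d.get(k, placeholder))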
In each case result is a dict of lists. If you want a list of lists it's actually even simpler:
result = [[d[k] for d in dictionary_list] for k in dictionary_list[0].keys()]
If you want all the lists to be the same length and include placeholders then you'll still need to use a dict of lists as an intermediate step. But it's easy to transform from a dict of lists to a list of lists of values:
list_of_lists_of_values = list(dict_of_lists_of_values.values())
That said, prior to Python 3.7, dictionaries didn't have a well-defined iteration order, so you're probably better off using a dictionary anyway, because otherwise it's hard to be certain you're getting the right values (e.g. "Total Population" isn't guaranteed to be the first series of values).
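One way to sidestep the ordering concern is to name the keys explicitly and unpack the result, which behaves the same on any Python version. This is only a sketch, assuming every dict contains all of the listed keys; the two-row dictionary_list is abbreviated from the question's data (values rounded) and the variable names are illustrative.

```python
# Two rows abbreviated from the question's data (values rounded for brevity).
dictionary_list = [
    {'Total Population:': 4585, 'Total Forest': 34.41, 'Total Developed': 17.21},
    {'Total Population:': 4751, 'Total Forest': 19.11, 'Total Developed': 37.27},
]

# The order of this tuple, not the dicts' own order, decides the output order.
columns = ('Total Population:', 'Total Developed', 'Total Forest')

# One list per column, unpacked straight into separate names.
population, developed, forest = (
    [d[k] for d in dictionary_list] for k in columns
)

print(population)  # [4585, 4751]
```

If a key might be missing from some dicts, d[k] raises KeyError here; d.get(k) with a placeholder would be the gentler variant.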
Upvotes: 0
Reputation: 9019
You could use pandas:
import pandas as pd
# my_dict is the list of dictionaries from the question
pd.DataFrame(my_dict).to_dict(orient='list')
Returns:
{'Total Barren Land': [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0],
'Total Developed': [17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746],
'Total Forest': [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715],
'Total Population:': [4585, 4751, 3214, 5005],
'Total Water Ice Cover': [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0]}
Upvotes: 1
Reputation: 99001
I guess you can use something like:
d = [{'Total Population:': 4585, 'Total Water Ice Cover': 2.848142234497044, 'Total Developed': 17.205368316575324, 'Total Barren Land': 0.22439908514219134, 'Total Forest': 34.40642126612868},
{'Total Population:': 4751, 'Total Water Ice Cover': 1.047783534830167, 'Total Developed': 37.27115716753022, 'Total Barren Land': 0.11514104778353484, 'Total Forest': 19.11341393206678},
{'Total Population:': 3214, 'Total Water Ice Cover': 0.09166603009701321, 'Total Developed': 23.50469788404247, 'Total Barren Land': 0.2597204186082041, 'Total Forest': 20.418608204109695},
{'Total Population:': 5005, 'Total Water Ice Cover': 0.0, 'Total Developed': 66.37545713124746, 'Total Barren Land': 0.0, 'Total Forest': 10.68671271840715}]
f = {}
for l in d:
    for k, v in l.items():
        # create the list the first time a key is seen
        if k not in f:
            f[k] = []
        f[k].append(v)
print(f)
{'Total Population:': [4585, 4751, 3214, 5005], 'Total Water Ice Cover': [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0], 'Total Developed': [17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746], 'Total Barren Land': [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0], 'Total Forest': [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]}
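The membership test and the append can also be collapsed with dict.setdefault, which returns the existing list for a key or stores and returns the supplied default. A short sketch on a trimmed-down version of d (the rows are shortened to two keys for illustration):

```python
# Trimmed-down rows from the example above (only two keys kept).
d = [
    {'Total Population:': 4585, 'Total Forest': 34.40642126612868},
    {'Total Population:': 4751, 'Total Forest': 19.11341393206678},
]

f = {}
for row in d:
    for k, v in row.items():
        # setdefault returns f[k] if present, else stores [] under k and returns it.
        f.setdefault(k, []).append(v)

print(f)
```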
Upvotes: 1
Reputation: 2905
Call your list of dictionaries dictionary_list. Then:
keys = {k for d in dictionary_list for k in d.keys()}
list_of_values = [[v for d in dictionary_list for k, v in d.items() if k == key] for key in keys]
Using your example this outputs:
[[17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746],
[0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0],
[2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0],
[4585, 4751, 3214, 5005],
[34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]]
If you want a new dictionary that maps each key to its list of values, replace the second line with:
new_dict = {key: [v for d in dictionary_list for k, v in d.items() if k == key] for key in keys}
Upvotes: 0