user10777757

Reputation: 23

How can I take each value from a list of dictionaries and put them into separate lists?

The output of my code gives the following:

[{'Total Population:': 4585, 'Total Water Ice Cover': 2.848142234497044, 'Total Developed': 17.205368316575324, 'Total Barren Land': 0.22439908514219134, 'Total Forest': 34.40642126612868},

 {'Total Population:': 4751, 'Total Water Ice Cover': 1.047783534830167, 'Total Developed': 37.27115716753022, 'Total Barren Land': 0.11514104778353484, 'Total Forest': 19.11341393206678},

 {'Total Population:': 3214, 'Total Water Ice Cover': 0.09166603009701321, 'Total Developed': 23.50469788404247, 'Total Barren Land': 0.2597204186082041, 'Total Forest': 20.418608204109695},

 {'Total Population:': 5005, 'Total Water Ice Cover': 0.0, 'Total Developed': 66.37545713124746, 'Total Barren Land': 0.0, 'Total Forest': 10.68671271840715},

...
]

What I'd like to do is get all the values for 'Total Population:' and store them in one list, then get all the values for 'Total Water Ice Cover' and store them in another list, and so on. With a data structure like this, how does one extract these values and store them in separate lists?

Thank you

Upvotes: 2

Views: 121

Answers (5)

pault

Reputation: 43524

If your goal is to calculate Pearson's correlation, you should use pandas for this.

Suppose your original list of dictionaries was stored in a variable called output. You can easily convert it into a pandas DataFrame using:

import pandas as pd
df = pd.DataFrame(output)
print(df)
#   Total Barren Land  Total Developed  Total Forest  Total Population:  Total Water Ice Cover
#0           0.224399        17.205368     34.406421               4585               2.848142
#1           0.115141        37.271157     19.113414               4751               1.047784
#2           0.259720        23.504698     20.418608               3214               0.091666
#3           0.000000        66.375457     10.686713               5005               0.000000

Now you can easily generate a correlation matrix:

# this is just to make the output print nicer
pd.set_option("display.precision", 4)  # only show 4 digits

# remove 'Total ' from column names to make printing smaller
df.rename(columns=lambda x: x.replace("Total ", ""), inplace=True)  

corr = df.corr(method="pearson")
print(corr)
#                 Barren Land  Developed  Forest  Population:  Water Ice Cover
#Barren Land           1.0000    -0.9579  0.7361      -0.7772           0.4001
#Developed            -0.9579     1.0000 -0.8693       0.5736          -0.6194
#Forest                0.7361    -0.8693  1.0000      -0.1575           0.9114
#Population:          -0.7772     0.5736 -0.1575       1.0000           0.2612
#Water Ice Cover       0.4001    -0.6194  0.9114       0.2612           1.0000

Now you can access individual correlations by key:

print(corr.loc["Forest", "Water Ice Cover"])
#0.91135717479534217
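
As a cross-check that does not need pandas, the same coefficient can be computed by hand. This is only a sketch: the `pearson` helper is a name I made up for illustration, and the two lists are the 'Total Forest' and 'Total Water Ice Cover' values from the four rows shown in the question.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

forest = [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]
water = [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0]

r = pearson(forest, water)
print(round(r, 4))  # should agree with corr.loc["Forest", "Water Ice Cover"]
```

For anything beyond a quick sanity check, the DataFrame route is less work, since it gives every pairwise correlation at once.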

Upvotes: 2

Daniel Pryden

Reputation: 60997

If all the dicts have the same keys, then you can just use the keys of the first dict:

result = {k:[d[k] for d in dictionary_list] for k in dictionary_list[0].keys()} 
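
To make that concrete, here is a minimal sketch on a hypothetical two-row sample (the shortened key names are mine, not the ones from the question). Each key ends up mapped to the list of its values, in row order.

```python
# Hypothetical sample: every dict has the same keys
dictionary_list = [
    {"Population": 4585, "Forest": 34.4},
    {"Population": 4751, "Forest": 19.1},
]

# One list per key, built from the keys of the first dict
result = {k: [d[k] for d in dictionary_list] for k in dictionary_list[0]}

population = result["Population"]
forest = result["Forest"]
print(population)  # [4585, 4751]
print(forest)      # [34.4, 19.1]
```

Note that `d[k]` raises a KeyError if any later dict is missing a key, so this version really does assume identical keys.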

If the dicts could have different sets of keys, but you're OK with lists of different lengths, I would use a defaultdict to simplify:

from collections import defaultdict
result = defaultdict(list)
for d in dictionary_list:
    for k, v in d.items():
        result[k].append(v)
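
Here is roughly what that looks like when one row lacks a key. The sample data is hypothetical; the point is only that the lists come out with different lengths.

```python
from collections import defaultdict

# Hypothetical sample: the second dict has no "Forest" entry
dictionary_list = [
    {"Population": 4585, "Forest": 34.4},
    {"Population": 4751},
    {"Population": 3214, "Forest": 20.4},
]

result = defaultdict(list)
for d in dictionary_list:
    for k, v in d.items():
        result[k].append(v)

print(dict(result))
# {'Population': [4585, 4751, 3214], 'Forest': [34.4, 20.4]}
```

The catch is that positions no longer line up with rows: `result["Forest"][1]` comes from the third dict, not the second.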

If the dicts could have different sets of keys, and you want all the lists to be the same length, then you'll need to iterate twice. You'll also need some kind of placeholder value to use for when the key is missing. If we want to use None for that, we can do:

placeholder = None
keys = set()
for d in dictionary_list:
    keys.update(d.keys())
result = {k:[] for k in keys}
for d in dictionary_list:
    for k in keys:
        result[k].append(d.get(k, placeholder))
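
A compact sketch of the placeholder approach, again on a hypothetical sample where one row is missing a key. It collects the keys first and then fills every list from every row, so all lists have one entry per row.

```python
# Hypothetical sample: the second dict has no "Forest" entry
dictionary_list = [
    {"Population": 4585, "Forest": 34.4},
    {"Population": 4751},
    {"Population": 3214, "Forest": 20.4},
]

placeholder = None

# First pass: collect every key that appears in any dict
keys = set()
for d in dictionary_list:
    keys.update(d)

# Second pass: one entry per row, using the placeholder for missing keys
result = {k: [d.get(k, placeholder) for d in dictionary_list] for k in keys}

print(result["Population"])  # [4585, 4751, 3214]
print(result["Forest"])      # [34.4, None, 20.4]
```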

In each case result is a dict of lists. If you want a list of lists it's actually even simpler:

result = [[d[k] for d in dictionary_list] for k in dictionary_list[0].keys()]

If you want all the lists to be the same length and include placeholders then you'll still need to use a dict of lists as an intermediate step. But it's easy to transform from a dict of lists to a list of lists of values:

list_of_lists_of_values = list(dict_of_lists_of_values.values())

That said, prior to Python 3.7, dictionaries didn't have a well-defined iteration order, so you're probably better off using a dictionary anyway, because otherwise it's hard to be certain you're getting the right values (e.g. "Total Population" isn't guaranteed to be the first series of values).
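
If you do want a list of lists, one way to avoid guessing positions is to keep the key order next to it and look series up by name. A small sketch, with hypothetical sample data:

```python
# Hypothetical sample: every dict has the same keys
dictionary_list = [
    {"Population": 4585, "Forest": 34.4},
    {"Population": 4751, "Forest": 19.1},
]

keys = list(dictionary_list[0])  # capture the key order once
list_of_lists = [[d[k] for d in dictionary_list] for k in keys]

# Look a series up by name instead of assuming its position
population = list_of_lists[keys.index("Population")]
print(population)  # [4585, 4751]
```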

Upvotes: 0

rahlf23

Reputation: 9019

You could use pandas (here my_dict is your list of dictionaries):

import pandas as pd
pd.DataFrame(my_dict).to_dict(orient='list')

Returns:

{'Total Barren Land': [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0],
'Total Developed': [17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746],
'Total Forest': [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715],
'Total Population:': [4585, 4751, 3214, 5005],
'Total Water Ice Cover': [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0]}

Upvotes: 1

Pedro Lobito

Reputation: 99001

I guess you can use something like:

d = [{'Total Population:': 4585, 'Total Water Ice Cover': 2.848142234497044, 'Total Developed': 17.205368316575324, 'Total Barren Land': 0.22439908514219134, 'Total Forest': 34.40642126612868},
 {'Total Population:': 4751, 'Total Water Ice Cover': 1.047783534830167, 'Total Developed': 37.27115716753022, 'Total Barren Land': 0.11514104778353484, 'Total Forest': 19.11341393206678},
 {'Total Population:': 3214, 'Total Water Ice Cover': 0.09166603009701321, 'Total Developed': 23.50469788404247, 'Total Barren Land': 0.2597204186082041, 'Total Forest': 20.418608204109695},
 {'Total Population:': 5005, 'Total Water Ice Cover': 0.0, 'Total Developed': 66.37545713124746, 'Total Barren Land': 0.0, 'Total Forest': 10.68671271840715}]

f = {}
for l in d:
    for k, v in l.items():
        if k not in f:
            f[k] = []
        f[k].append(v)
print(f)

{'Total Population:': [4585, 4751, 3214, 5005], 'Total Water Ice Cover': [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0], 'Total Developed': [17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746], 'Total Barren Land': [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0], 'Total Forest': [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]}
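
The membership check can also be folded into `dict.setdefault`, which returns the existing list for a key or inserts a new empty one. A short sketch of that variant, using a hypothetical two-row sample with shortened keys:

```python
# Hypothetical sample with shortened keys
d = [
    {"Population": 4585, "Forest": 34.4},
    {"Population": 4751, "Forest": 19.1},
]

f = {}
for row in d:
    for k, v in row.items():
        # setdefault creates the list the first time a key is seen
        f.setdefault(k, []).append(v)

print(f)
# {'Population': [4585, 4751], 'Forest': [34.4, 19.1]}
```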


Upvotes: 1

ShlomiF

Reputation: 2905

Call your list of dictionaries dictionary_list. Then:

keys = {k for d in dictionary_list for k in d.keys()}
list_of_values = [[v for d in dictionary_list for k, v in d.items() if k == key] for key in keys]

Using your example, this outputs the following (the order of the inner lists can vary, because keys is a set):

[[17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746],
 [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0],
 [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0],
 [4585, 4751, 3214, 5005],
 [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]]

If you want a new dictionary with the relevant value lists, then replace the second line with:

new_dict = {key: [v for d in dictionary_list for k, v in d.items() if k == key] for key in keys}
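
If a repeatable order matters for the list-of-lists form, one option is to sort the keys before building the lists. This is just a sketch on hypothetical sample data; sorting alphabetically is my choice here, any fixed ordering would do.

```python
# Hypothetical sample with shortened keys
dictionary_list = [
    {"Population": 4585, "Forest": 34.4},
    {"Population": 4751, "Forest": 19.1},
]

# Sorting the set of keys gives the same order on every run
keys = sorted({k for d in dictionary_list for k in d})
list_of_values = [[d[key] for d in dictionary_list if key in d] for key in keys]

print(keys)            # ['Forest', 'Population']
print(list_of_values)  # [[34.4, 19.1], [4585, 4751]]
```

Looking values up with `d[key]` also avoids scanning every item of every dict for each key.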

Upvotes: 0
