Caiotru
Caiotru

Reputation: 325

Python loop a dictionary with list values

I have the following dataframe:

    data = {'state': ['Rome', 'Venice', 'NY', 'Boston', 'London', 'Bristol'],
    'year': [2000, 2001, 2002, 2001, 2003, 2003],
    'number': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

df = pd.DataFrame(data)

and created a dictionary as per below:

dic = {
    'it':['Rome', 'Venice'], 
    'UK':['London', 'Bristol'],
    'US':['NY', 'Boston']
}

Is there a way to iterate through the dictionary, groupby year, find the mean number and create new dataframes named as the keys in the dictonary.

I have tried something like this but its not working....

for x, y in dic.items():
    x = df[df['state'].isin(y)].groupby(['year'], as_index=False)['numer'].mean()

For example the expected output for UK for would be the below:

UK

    year    number
0   2003    3.05

Upvotes: 1

Views: 70

Answers (2)

A simplier way is to create a mapping with the continent as the value in the key/value pair. Then replace the mapping of the state column into the continent column. Last use the groupby function on continient and year and output the mean of the number column

data = {'state': ['Rome', 'Venice', 'NY', 'Boston', 'London', 'Bristol'],
'year': [2000, 2001, 2002, 2001, 2003, 2003],
'number': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

mapping = {
'Rome':'it', 
'Venice':'it',
'London':'UK',
'Bristol':'UK',
'NY':'US',
'Boston':'US'
}

df = pd.DataFrame(data)
df['continent']=df['state'].replace(mapping)
print(df.head())
print(df.groupby(['continent','year'])['number'].mean())

Upvotes: 0

Andrej Kesely
Andrej Kesely

Reputation: 195543

You code is almost correct, just a typo in numer and store the results in dictionary:

import pandas as pd

data = {'state': ['Rome', 'Venice', 'NY', 'Boston', 'London', 'Bristol'],
    'year': [2000, 2001, 2002, 2001, 2003, 2003],
    'number': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

dic = {
    'it':['Rome', 'Venice'],
    'UK':['London', 'Bristol'],
    'US':['NY', 'Boston']
}

df = pd.DataFrame(data)

out = {}
for x, y in dic.items():
    out[x] = df[df['state'].isin(y)].groupby(['year'], as_index=False)['number'].mean()

for country, df in out.items():
    print(country)
    print(df)
    print('-' * 80)

Prints:

it
   year  number
0  2000     1.5
1  2001     1.7
--------------------------------------------------------------------------------
UK
   year  number
0  2003    3.05
--------------------------------------------------------------------------------
US
   year  number
0  2001     2.4
1  2002     3.6
--------------------------------------------------------------------------------

Upvotes: 2

Related Questions