Reputation: 60
I need help looping through nested dictionaries that I have created in order to answer some questions. The code that builds the two dictionaries and adds items to them is below. Link to the CSV: https://docs.google.com/document/d/1v68_QQX7Tn96l-b0LMO9YZ4ZAn_KWDMUJboa6LEyPr8/edit?usp=sharing
import csv
region_data = {}
country_data = {}
answers = []
data = []
cuntry = False
f = open('dph_SYB60_T03_Population Growth, Fertility and Mortality Indicators.csv')
reader = csv.DictReader(f)
for line in reader:
    # This gets all the values into a standard dict
    data.append(dict(line))
# This will loop through the dicts and create variables to hold specific items
for i in data:
    # collects all of the Region/Country/Area
    location = i['Region/Country/Area']
    # Gets all the years
    years = i['Year']
    i_d = i['ID']
    info = i['Footnotes']
    series = i['Series']
    value = float(i['Value'])
    # print(series)
    stats = {i['Series']: i['Value']}
    # print(stats)
    # print(value)
    if i['ID'] == '4':
        cuntry = True
    if cuntry == True:
        if location not in country_data:
            country_data[location] = {}
        if years not in country_data[location]:
            country_data[location][years] = {}
        if series not in country_data[location][years]:
            country_data[location][years][series] = value
    else:
        if location not in region_data:
            region_data[location] = {}
        if years not in region_data[location]:
            region_data[location][years] = {}
        if series not in region_data[location][years]:
            region_data[location][years][series] = value
When I print the dictionary region_data, the output is a nested dictionary.
For clarification: each "Region" is a key in the outer dict. Its value is a dict keyed by year, and each year's value is in turn a dict mapping series names to their values.
I want to understand how I can loop through the data and answer a question like:
Which region had the largest numeric decrease in Maternal mortality ratio from 2005 to 2015?
Here "Maternal mortality ratio (deaths per 100,000 population)" is a key within the dictionary.
Upvotes: 0
Views: 216
Reputation: 570
If you prefer to loop through the dictionaries in Python 3.x, you can call the .items() method on each dictionary and nest three loops.
With the main dictionary called dict_total here, this code will do it:
out_region = None
out_value = None
sel_serie = 'Maternal mortality ratio (deaths per 100,000 population)'
min_year = 2005
max_year = 2015
for reg, dict_reg in dict_total.items():
    print(reg)
    for year, dict_year in dict_reg.items():
        # Years read from the CSV are strings, so convert before comparing
        if min_year <= int(year) <= max_year:
            print(year)
            for serie, value in dict_year.items():
                if serie == sel_serie and value is not None:
                    print('{} {}'.format(serie, value))
                    # Keep the largest value seen so far and its region
                    if out_value is None or out_value < value:
                        out_value = value
                        out_region = reg
print('Region: {}\nSerie: {} Value: {}'.format(out_region, sel_serie, out_value))
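Note that the loop above reports the largest single value in the 2005 to 2015 range, not the change between the two years. A minimal sketch of the decrease calculation follows, assuming the year keys are strings (as they come out of csv.DictReader). The helper name largest_decrease and the numbers in example_total are made up for illustration and are not taken from the real CSV.

```python
def largest_decrease(dict_total, serie, start_year, end_year):
    """Return (region, decrease) with the biggest drop in `serie` between two years."""
    best_region, best_drop = None, None
    for reg, dict_reg in dict_total.items():
        # Year keys are assumed to be strings such as '2005'
        start = dict_reg.get(str(start_year), {}).get(serie)
        end = dict_reg.get(str(end_year), {}).get(serie)
        if start is None or end is None:
            continue  # skip regions missing either year
        drop = start - end  # positive when the value went down
        if best_drop is None or drop > best_drop:
            best_region, best_drop = reg, drop
    return best_region, best_drop


# Made-up numbers, for illustration only (not taken from the real CSV)
sel_serie = 'Maternal mortality ratio (deaths per 100,000 population)'
example_total = {
    'Region A': {'2005': {sel_serie: 300.0}, '2015': {sel_serie: 250.0}},
    'Region B': {'2005': {sel_serie: 500.0}, '2015': {sel_serie: 380.0}},
    'Region C': {'2005': {sel_serie: 120.0}},  # no 2015 record, skipped
}
result = largest_decrease(example_total, sel_serie, 2005, 2015)
print(result)  # ('Region B', 120.0)
```

With the question's data you would pass region_data instead of example_total. Regions that lack a 2005 or 2015 record are simply skipped in this sketch.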
Upvotes: 1
Reputation: 570
Use pandas for that and read your file according to this answer.
import pandas as pd
filename = 'dph_SYB60_T03_Population Growth, Fertility and Mortality Indicators.csv'
df = pd.read_csv(filename)
Then you can build a pivot table on "Region/Country/Area" and "Series", using "max" as the aggregate function.
pivot = df.pivot_table(index='Region/Country/Area', columns='Series', values='Value', aggfunc='max')
Then sort the pivot table by a series name, passing the "ascending" argument.
df_sort = pivot.sort_values(by='Maternal mortality ratio (deaths per 100,000 population)', ascending=False)
Finally you will have the answer to your question.
df_sort['Maternal mortality ratio (deaths per 100,000 population)'].head(1)
Region/Country/Area
Sierra Leone 1986.0
Name: Maternal mortality ratio (deaths per 100,000 population), dtype: float64
Warning: Some of your regions have records before 2005, so you should filter your data only for values between 2005 and 2015.
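The filtering mentioned in the warning could look roughly like this. It is only a sketch: it assumes the Year column is numeric, and it uses a tiny made-up DataFrame in place of the real CSV (the column names follow the question, the values are invented). It keeps only 2005 and 2015, pivots the years into columns, and subtracts them to get the decrease.

```python
import pandas as pd

serie = 'Maternal mortality ratio (deaths per 100,000 population)'
# Made-up rows for illustration only; column names follow the question's CSV
df = pd.DataFrame({
    'Region/Country/Area': ['Region A', 'Region A', 'Region B', 'Region B'],
    'Year': [2005, 2015, 2005, 2015],
    'Series': [serie] * 4,
    'Value': [300.0, 250.0, 500.0, 380.0],
})

# Keep only the selected series and the two years of interest
subset = df[(df['Series'] == serie) & (df['Year'].isin([2005, 2015]))]
# One row per region, one column per year
by_year = subset.pivot_table(index='Region/Country/Area', columns='Year', values='Value')
# Positive numbers mean the ratio went down between 2005 and 2015
decrease = (by_year[2005] - by_year[2015]).dropna()
top_region = decrease.idxmax()
print(top_region, decrease.max())
```

If Year is read as text from the real file, compare against '2005' and '2015' instead, or convert the column first.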
Upvotes: 1