Reputation: 1
I am exploring the basics of pandas and working on an assignment that I found.
I have created a list that contains the names for future pandas DataFrames. This is the list:
DF_names_by_year = ['year_1985', 'year_1986', 'year_1987', ..., 'year_2010', 'year_2011', 'year_2012', 'year_2013']
I have a big DataFrame with information for each year in the list, and I need to make a graph showing some of that information per year. I want to split the DataFrame by year and give each resulting DataFrame the matching name from the list.
It works if I type the name:
year_1985 = pd.DataFrame(teams_wins_salaries.loc[teams_wins_salaries['yearID'] == 1985])
But if I put it into a loop, I end up with a list of empty DataFrames:
for i in range(len(DF_names_by_year)):
    DF_names_by_year[i] = pd.DataFrame(teams_wins_salaries.loc[teams_wins_salaries['yearID'] == i])
[Empty DataFrame
Columns: [yearID, teamID, W, salary]
Index: [], Empty DataFrame
Columns: [yearID, teamID, W, salary]
Index: [], Empty DataFrame
Columns: [yearID, teamID, W, salary]
My intuition tells me that there should be a way to separate the df and give a name to each part. I only wonder if it is possible to give them names from the list.
I would be grateful for any ideas on how to solve the problem.
Upvotes: 0
Views: 55
Reputation: 107567
Consider groupby to split your data frame by all unique years. Also, consider using a list or dictionary of data frames instead of flooding your global environment with many similarly structured objects.
# LIST COMPREHENSION
year_df_list = [g for i,g in teams_wins_salaries.groupby('yearID')]
# DICTIONARY COMPREHENSION
year_df_dict = {i:g for i,g in teams_wins_salaries.groupby('yearID')}
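To see what the dictionary comprehension produces, here is a minimal sketch on a toy frame. The rows below are made up for illustration and only mimic the column names shown in the question (yearID, teamID, W, salary); it assumes yearID holds integers, as the `== 1985` filter in the question suggests.

```python
import pandas as pd

# Toy stand-in for teams_wins_salaries; the values are invented for illustration
teams_wins_salaries = pd.DataFrame({
    "yearID": [1985, 1985, 1986, 1987],
    "teamID": ["ATL", "BAL", "ATL", "BAL"],
    "W": [66, 83, 72, 67],
    "salary": [100, 200, 300, 400],
})

# groupby yields (key, sub-frame) pairs, one pair per unique yearID
year_df_dict = {year: group for year, group in teams_wins_salaries.groupby("yearID")}

print(list(year_df_dict))        # the keys are the yearID values themselves
print(year_df_dict[1985].shape)  # rows and columns of the 1985 sub-frame
```

Because the keys come straight from the yearID column, they have that column's type, so an integer column gives integer keys rather than strings.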
You lose no functionality of the data frame if it is stored in a list or dict. So instead of maintaining 30+ separate, isolated, named global items you maintain one that can be traversed, looped, graphed easily and harmoniously:
year_df_list[1].head()
year_df_list[2].describe()
year_df_list[3].shape
year_df_dict[1985].head()
year_df_dict[1990].describe()
year_df_dict[1995].shape
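If you still want the names from your list, they can serve as dictionary keys. The sketch below is one possible way to do that, not the only one: it uses invented toy data, a shortened version of DF_names_by_year, and a hypothetical variable frames_by_name. It also shows why the original loop gave empty frames: it compared yearID with the loop index (0, 1, 2, ...) rather than with a year.

```python
import pandas as pd

# Toy stand-in for teams_wins_salaries; the values are invented for illustration
teams_wins_salaries = pd.DataFrame({
    "yearID": [1985, 1985, 1986, 1987],
    "teamID": ["ATL", "BAL", "ATL", "BAL"],
    "W": [66, 83, 72, 67],
    "salary": [100, 200, 300, 400],
})

# Shortened version of the list from the question
DF_names_by_year = ["year_1985", "year_1986", "year_1987"]

# Derive the year from each name instead of using the loop index,
# which never equals any yearID and so selects no rows
frames_by_name = {}
for name in DF_names_by_year:
    year = int(name.split("_")[1])  # "year_1985" -> 1985
    frames_by_name[name] = teams_wins_salaries[teams_wins_salaries["yearID"] == year]

# Equivalent result built directly from groupby, without the list of names
frames_from_groupby = {
    f"year_{year}": group for year, group in teams_wins_salaries.groupby("yearID")
}

print(frames_by_name["year_1985"])
```

Either dictionary can then be looped over with `.items()` to draw one graph per year, with the key available as a label.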
Upvotes: 1