create multiple new data frames in a loop by subsetting another and adding a suffix

Question

I have a bunch of dataframes, and from each one I would like to create a new dataframe by subsetting a fixed list of columns and adding a suffix to its name. Thus far I have made a list of all my dataframes:

buckets_list = ['one_season_bucket_year1',
'two_season_bucket_year1',
'two_season_bucket_year2',
'three_season_bucket_year1',
'three_season_bucket_year2',
'three_season_bucket_year3',
'four_season_bucket_year1',
'four_season_bucket_year2',
'four_season_bucket_year3',
'five_season_bucket_year1',
'five_season_bucket_year2',
'five_season_bucket_year3',
'five_season_bucket_year4',
'five_season_bucket_year5']

I have a list of all the columns I want to subset:

player_bio_list = ['games',
'height',
'position',
'minutes']

and I try to make a for loop to make a new dataframe:

for bucket in buckets_list:
    vars()[str(bucket) + "player_bio"]= pd.DataFrame(bucket[player_bio_list])

but I get the error "TypeError: string indices must be integers". What am I missing here? I have googled the error with little success, since it occurs for a lot of different reasons.

Rui Bastos · Accepted Answer

Your buckets_list is simply a list of strings, not of dataframes, so bucket[player_bio_list] indexes a string with a list, which is what raises the TypeError. If you really want to keep the names of the variables and add a suffix, make buckets_list a dictionary of the form {dataframe_variable_name: dataframe_object}, and iterate through it with for key, val in buckets_list.items():, where key is the variable name and val is the dataframe object.
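To see where the error comes from, here is a minimal sketch that reuses two of the names from the question. It needs no dataframes at all, because the loop variable is only ever a string. The exact wording of the message varies slightly between Python versions.

```python
# Two of the names from the question: these are plain strings, not dataframes.
buckets_list = ['one_season_bucket_year1', 'two_season_bucket_year1']
player_bio_list = ['games', 'height', 'position', 'minutes']

error_message = None
for bucket in buckets_list:
    try:
        # bucket is a str, so this indexes a string with a list
        bucket[player_bio_list]
    except TypeError as exc:
        error_message = str(exc)
        break

print(type(buckets_list[0]).__name__)  # the element type is str
print(error_message)
```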

EDIT: for clarification and answering the OP's comment

buckets_list = {dataframe1_name: dataframe1,
                dataframe2_name: dataframe2,
                ...}

Just fill it up with all the dataframes you want...
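Putting it together, a minimal sketch of the dictionary approach might look like the following. The two small dataframes and the extra points column are made-up stand-ins for the real data, and the "_player_bio" suffix (with a leading underscore) and the result name player_bio are illustrative choices, not something the question fixes.

```python
import pandas as pd

# Hypothetical stand-in data; in practice these are your existing dataframes.
one_season_bucket_year1 = pd.DataFrame({
    'games': [82, 70], 'height': [78, 81], 'position': ['G', 'F'],
    'minutes': [2400, 1900], 'points': [1500, 900],
})
two_season_bucket_year1 = pd.DataFrame({
    'games': [60, 75], 'height': [80, 76], 'position': ['C', 'G'],
    'minutes': [1500, 2100], 'points': [700, 1300],
})

# Map each variable name (a string) to the dataframe object itself.
buckets_list = {
    'one_season_bucket_year1': one_season_bucket_year1,
    'two_season_bucket_year1': two_season_bucket_year1,
}

player_bio_list = ['games', 'height', 'position', 'minutes']

# Build the subsets, keyed by the original name plus a suffix.
player_bio = {}
for key, val in buckets_list.items():
    # .copy() so later edits to a subset do not touch the original dataframe
    player_bio[key + '_player_bio'] = val[player_bio_list].copy()

print(list(player_bio))
print(player_bio['one_season_bucket_year1_player_bio'].columns.tolist())
```

Keeping the results in a dictionary instead of writing into vars() means each subset is reached by key, for example player_bio['one_season_bucket_year1_player_bio'], and the whole collection can be looped over again later.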
