Denis

Reputation: 99

Loop over multiple Dataframes

I have to resample a bunch of dataframes. They are simply named df_1, df_2, and so on (I have about 50 of them). I can easily resample each one separately, like this:

df_out_1 = resample(df_1, replace=False, n_samples=50, random_state=11) 
df_out_2 = resample(df_2, replace=False, n_samples=50, random_state=11) 
....

It works, but writing 50 nearly identical lines of code is not very smart. So I tried a loop:

df_list=[('df_'+str(i),'df_out_'+str(i)) for i in range(1,52)]
for (df,df_out) in df_list: 
    # Downsample majority class
    df_out = resample(df, replace=False, n_samples=50, random_state=11) 

It doesn't work because, for Python, df and df_out in the loop are not dataframes but strings. I don't know how to fix it. :(

Thanks in advance, D.

Upvotes: 0

Views: 72

Answers (1)

Patrick

Reputation: 334

Use globals()[string] to look up the module-level variable whose name is stored in the string.

Full code:

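# Assuming resample is sklearn.utils.resample, as the arguments suggest
from sklearn.utils import resample

df_list=[('df_'+str(i),'df_out_'+str(i)) for i in range(1,52)]
for (df_str,df_out_str) in df_list:
    # Look up the existing input dataframe by its name
    df = globals()[df_str]
    # Downsample majority class and bind the result to the output name;
    # df_out_* does not exist yet, so it is assigned rather than looked up
    globals()[df_out_str] = resample(df, replace=False, n_samples=50, random_state=11)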
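An alternative that avoids looking names up in globals() is to keep the dataframes in a dict keyed by their number and build a second dict with the results. The sketch below is only an illustration under some assumptions: resample_all, sample_rows, frames and frames_out are hypothetical names, and sample_rows is a standard-library stand-in (working on lists of rows) so the example runs without third-party packages. With scikit-learn installed, sklearn.utils.resample could be passed in its place, since it accepts the same replace, n_samples and random_state keyword arguments.

```python
import random


def resample_all(frames, resample_fn, **kwargs):
    """Apply resample_fn to every value in frames, keeping the same keys."""
    return {name: resample_fn(frame, **kwargs) for name, frame in frames.items()}


# Stand-in resampler (hypothetical) so the sketch runs with the standard library only.
# With scikit-learn, pass sklearn.utils.resample to resample_all instead.
def sample_rows(rows, replace=False, n_samples=None, random_state=None):
    rng = random.Random(random_state)
    if replace:
        # Sampling with replacement: rows may repeat
        return [rng.choice(rows) for _ in range(n_samples)]
    # Sampling without replacement: every picked row is distinct
    return rng.sample(rows, n_samples)


# Hypothetical data: three small "frames" stored in a dict instead of df_1, df_2, ...
frames = {i: [(i, row) for row in range(100)] for i in range(1, 4)}

# One call replaces the 50 near-identical lines; results are keyed like the inputs
frames_out = resample_all(
    frames, sample_rows, replace=False, n_samples=50, random_state=11
)
```

The result for the first dataframe is then frames_out[1] rather than df_out_1, and adding or removing inputs only changes the contents of the dict, not the code.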

Upvotes: 1
