Andrea P.

Reputation: 72

Nested Dynamic Variables for Dataframes in Loop

I have multiple pandas dataframes with the same columns but different values, and I need to run an analysis on the values of specific columns.

I have 7 dataframes to work with, but let's suppose I had only two.

df1 = pd.DataFrame({'a': [0, 0.5, 0.2],
                   'b': [1,1,0.3], 'c':['A','A','B']})

df2 = pd.DataFrame({'a': [4, 1, 6],
                   'b': [6.2,0.3,0.3], 'c': ['B','A','A']})

I opted to use global variables in a for loop.

I created:

Data needs to be taken from each df in dflist, processed, and finally stored under the names in sumlist.

So as not to get lost, I want my dynamic variables to take their names from the values in sumlist.

Here's where I get stuck. The variables I want to create are based on columns of dataframes df1, df2. However the output for each dynamic variable will contain all values from all columns.

dflist= [df1, df2]
sumlist= ['name1', 'name2']

for i in dflist:
    for name in sumlist:
        globals()['var{name}'] = i['c'].to_list()

On this dummy example, for some reason, I get the following error:

varname1
NameError: name 'varname1' is not defined

In the case of the original dataframe, my list varname1 will give the following result:

['A','A','B','B','B','A']

Instead I should have had:

varname1 = ['A','A','B']
varname2 = ['B','B','A']

What puzzles me is that the very same code "works" (albeit wrongly) in one case but raises an error in the other.

I need to overcome the issue or I will be forced to manually write every single variable.

Upvotes: 1

Views: 759

Answers (3)

Ulewsky

Reputation: 329

I think the error in your dummy example comes from the missing f prefix before the opening quote of the f-string.

It should be like this:

globals()[f'var{name}'] = i['c'].to_list()
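To make the difference concrete, here is a minimal sketch in plain Python (no pandas needed). The variable names `plain_key` and `formatted_key` are made up for illustration and are not from the question.

```python
name = 'name1'

plain_key = 'var{name}'       # no f prefix: the braces stay literal characters
formatted_key = f'var{name}'  # f prefix: {name} is replaced by the value of name

print(plain_key)      # var{name}
print(formatted_key)  # varname1
```

Without the prefix, the key stored in globals() is the literal text var{name}, so no variable called varname1 ever exists, which would explain the NameError.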

Upvotes: 1

U13-Forward

Reputation: 71610

Well, my suggestion would be to use a dictionary instead of using an unsafe globals command. So instead of:

for i in dflist:
    for name in sumlist:
        globals()['var{name}'] = i['c'].to_list()

You should do:

d = {}
for i, name in zip(dflist, sumlist):
    d[f'var{name}'] = i['c'].tolist()

Notice I am using a zip function to iterate the two lists in parallel.
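For reference, a self-contained sketch of the same idea on the dummy data from the question, written as a dict comprehension. The loop variable is called `df` here instead of `i`; that rename is my own choice, not part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [0, 0.5, 0.2], 'b': [1, 1, 0.3], 'c': ['A', 'A', 'B']})
df2 = pd.DataFrame({'a': [4, 1, 6], 'b': [6.2, 0.3, 0.3], 'c': ['B', 'A', 'A']})

dflist = [df1, df2]
sumlist = ['name1', 'name2']

# zip pairs df1 with 'name1' and df2 with 'name2', so each name gets exactly one dataframe
d = {f'var{name}': df['c'].tolist() for df, name in zip(dflist, sumlist)}

print(d['varname1'])  # ['A', 'A', 'B']
print(d['varname2'])  # ['B', 'A', 'A']
```

The values are then looked up as d['varname1'] rather than as a bare variable, which keeps all seven results in one place.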

Upvotes: 2

Peter Badida

Reputation: 12189

You are missing the f prefix for the f-string.

    globals()['var{name}'] = i['c'].to_list()

vs

    globals()[f'var{name}'] = i['c'].to_list()

Without the f, your global variable is literally named var{name} and is overwritten on every iteration, instead of being named varname1.

Also, it is better to use a dictionary instead of globals().
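The overwriting can be seen in a small sketch. It makes two assumptions: plain lists stand in for df1['c'] and df2['c'], and ordinary dicts (`namespace`, `fixed`, both hypothetical names) stand in for globals() so the result is easy to inspect.

```python
# Plain lists stand in for df1['c'] and df2['c'] from the question's dummy data.
columns = [['A', 'A', 'B'], ['B', 'A', 'A']]
sumlist = ['name1', 'name2']

namespace = {}  # hypothetical stand-in for globals()

# Missing f prefix: every iteration writes to the same literal key.
for col in columns:
    for name in sumlist:
        namespace['var{name}'] = col

print(list(namespace))  # ['var{name}']

# With the f prefix the keys are right, but the nested loops still assign
# every column to every name, so the last column wins for both names.
fixed = {}
for col in columns:
    for name in sumlist:
        fixed[f'var{name}'] = col

print(fixed)  # {'varname1': ['B', 'A', 'A'], 'varname2': ['B', 'A', 'A']}
```

So adding the f prefix alone removes the NameError but still does not give one list per dataframe; pairing each dataframe with its name (for example with zip) is what fixes that part.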

Upvotes: 1
