S Hendricks

Reputation: 119

Create new DataFrames according to a dictionary using a for loop

I am trying to create multiple DataFrames with similar names. The names change based on a list, and each one is produced by the same operation:

corr_C = train[train_C].apply(lambda x: x.corr(train['target'])).abs()
corr_C = corr_C.sort_values(ascending=False)

I also have train_D, train_E and train_F, and I want to apply the same function to each of them.

The solutions I found online only cover looping across columns, but I need the loop to change the name of the DataFrame it creates on each iteration. This is what I tried:

list=['C','D','E','F']
for list in list:
corr_+list=train[train_list].apply(lambda x: x.corr(train['target'])).abs() 
return corr_+list=corr_list.sort_values(ascending=False, inplace=True)

SyntaxError: invalid syntax

Upvotes: 0

Views: 367

Answers (2)

S Hendricks

Reputation: 119

I eventually sorted it out:

cols = ['C', 'D', 'E', 'F']
for col in cols:
    dfname = 'corr_' + col
    dfnew = train['train' + col].apply(lambda x: x.corr(train['target'])).abs()
    # create a variable named corr_C, corr_D, ... (only takes effect at module level)
    locals()[dfname] = dfnew
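
One caveat worth noting with this approach: writing into locals() only creates a variable at module level, where locals() is the same dictionary as globals(). Inside a function it returns a snapshot, so the assignment is silently lost. A minimal sketch of that behaviour (the function name make_names is hypothetical, used only for illustration):

```python
def make_names():
    # Inside a function, locals() is only a snapshot of the local namespace;
    # writing to it does not create a real local variable.
    locals()['corr_C'] = 1
    try:
        # corr_C is looked up as a global here, and no such global exists
        return corr_C
    except NameError:
        return None

result = make_names()
print(result)  # None
```

If the loop ever moves into a function, storing the results in a dictionary keyed by name avoids this problem.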

Upvotes: 0

Barry

Reputation: 11

You could make a dictionary that you then populate with the key (name) and value (dataframe). This is what I usually do.

#use pandas for dataframe and numpy for random
import pandas as pd
import numpy as np

#some random array data to turn into pd dataframes
my_arrays = []
for i in range(0, 3):
    my_arrays.append(np.random.randint(10, size=(5,5)))

#some array names (this could be done more programmatically)
my_array_names = ["First", "Second", "Third"]

#make a dictionary
d = {}
for i in range(0, len(my_arrays)):
    #Populate dictionary --> d[key]=value
    d[my_array_names[i]]=pd.DataFrame(my_arrays[i], columns = ['C1', 'C2', 'C3', 'C4', 'C5'], index = ['R1', 'R2', 'R3', 'R4', 'R5'])

#print them out to take a look
for key, value in d.items():
    print(key)
    print(value)

#or call individually
#print(d["First"])
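
Applied to the correlation problem in the question, the same dictionary idea might look like the sketch below. The `train` frame, its column names, and the `column_groups` mapping are made-up stand-ins for the asker's data (train_C, train_D, ...), not the real thing.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `train` DataFrame
rng = np.random.default_rng(0)
train = pd.DataFrame(
    rng.normal(size=(50, 5)),
    columns=['c1', 'c2', 'd1', 'd2', 'target'],
)

# Hypothetical column groups, standing in for train_C, train_D, ...
column_groups = {'C': ['c1', 'c2'], 'D': ['d1', 'd2']}

# One dictionary entry per group instead of one variable per group
corr = {}
for name, columns in column_groups.items():
    corr[name] = (
        train[columns]
        .apply(lambda x: x.corr(train['target']))  # correlation of each column with the target
        .abs()
        .sort_values(ascending=False)  # returns a new, sorted Series
    )

# Look up a single result by its key
print(corr['C'])
```

Here sort_values is called without inplace=True, because with inplace=True it returns None and the assignment would store None instead of the sorted Series.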

Upvotes: 0
