Zanshin

Reputation: 1272

Run func(df) to create new dataframes and rename them

After running func(df) on df10 and df20, can I keep the same names and access each result individually, or give the results new names?

import pandas as pd

df = pd.DataFrame({
    'A': ['d','d','d','d','d','d','g','g','g','g','g','g','k','k','k','k','k','k'],
    'B': [5,5,6,4,5,6,-6,7,7,6,-7,7,-8,7,-6,6,-7,50],
    'C': [1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2],
    'S': [2012,2013,2014,2015,2016,2012,2012,2014,2015,2016,2012,2013,2012,2013,2014,2015,2016,2014]
})

df10 = (df.B + df.C).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0)

df20 = (df['B'] - df['C']).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0)

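def func(df):
    # Sum across the years for each top-level column ('sum', 'size').
    # Note: groupby(..., axis=1) is deprecated in recent pandas versions;
    # df.T.groupby(level=0).sum().T should be an equivalent spelling there.
    df1 = df.groupby(level=0, axis=1).sum()
    # Label the totals as ('sum', 'total') and ('size', 'total').
    new_cols = list(zip(df1.columns.get_level_values(0), ['total'] * len(df1.columns)))
    df1.columns = pd.MultiIndex.from_tuples(new_cols)
    # Put the totals next to the yearly columns and order by year.
    df2 = pd.concat([df1, df], axis=1).sort_index(axis=1).sort_index(axis=1, level=1)
    # Flatten the MultiIndex: 'sum_2012' -> '2012', 'size_2012' -> 'T2012'.
    df2.columns = ['_'.join((col[0], str(col[1]))) for col in df2.columns]
    df2.columns = df2.columns.str.replace('sum_', '')
    df2.columns = df2.columns.str.replace('size_', 'T')
    return df2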

dfs = []
for frame in [df10, df20]:  # 'frame' avoids overwriting the original df
    dfs.append(func(frame))

dfs

Upvotes: 1

Views: 53

Answers (1)

jezrael

Reputation: 863166

You can store the results in a dict, using a list of new names as the keys:

names = ['a','b']
dfs = {names[i]:func(df) for i,df in enumerate([df10, df20])}
print (dfs)
{'a':    T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A                                                                            
d      2    13      1     6      1     7      1     5      1     6       6   
g      2   -11      1     8      1     8      1     8      1     7       6   
k      1    -6      1     9      2    48      1     8      1    -5       6   

   total  
A         
d     37  
g     20  
k     54  , 'b':    T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A                                                                            
d      2     9      1     4      1     5      1     3      1     4       6   
g      2   -15      1     6      1     6      1     6      1     5       6   
k      1   -10      1     5      2    40      1     4      1    -9       6   

   total  
A         
d     25  
g      8  
k     30  }
print (dfs['a'])
   T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A                                                                            
d      2    13      1     6      1     7      1     5      1     6       6   
g      2   -11      1     8      1     8      1     8      1     7       6   
k      1    -6      1     9      2    48      1     8      1    -5       6   

   total  
A         
d     37  
g     20  
k     54  

print (dfs['b'])
   T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A                                                                            
d      2     9      1     4      1     5      1     3      1     4       6   
g      2   -15      1     6      1     6      1     6      1     5       6   
k      1   -10      1     5      2    40      1     4      1    -9       6   

   total  
A         
d     25  
g      8  
k     30  
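The same dict can also be built by pairing names and DataFrames with zip, which avoids indexing into names. Below is a minimal sketch of that idea. To keep it short and independent of the pandas version, it uses a hypothetical stand-in add_total in place of func and two small made-up DataFrames, so only the naming pattern carries over.

```python
import pandas as pd

def add_total(frame):
    # Hypothetical stand-in for func: returns a NEW frame with a row-wise total.
    out = frame.copy()
    out["total"] = frame.sum(axis=1)
    return out

# Small made-up inputs, used here only for illustration.
df10 = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
df20 = pd.DataFrame({"x": [10, 20], "y": [30, 40]})

names = ["a", "b"]
# Pair each name with the processed DataFrame.
dfs = {name: add_total(frame) for name, frame in zip(names, [df10, df20])}

# Each result is reachable by its name, or by looping over the dict.
for name, frame in dfs.items():
    print(name, frame["total"].tolist())
```

A dict tends to scale better than separate variables once there are more than a handful of DataFrames, because the names stay data rather than code.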

If you need to keep the same names, assign the output of func back to the same variables:

df10 = func(df10)
df20 = func(df20)
print (df10)
   T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A                                                                            
d      2    13      1     6      1     7      1     5      1     6       6   
g      2   -11      1     8      1     8      1     8      1     7       6   
k      1    -6      1     9      2    48      1     8      1    -5       6   

   total  
A         
d     37  
g     20  
k     54 
print (df20)
   T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A                                                                            
d      2     9      1     4      1     5      1     3      1     4       6   
g      2   -15      1     6      1     6      1     6      1     5       6   
k      1   -10      1     5      2    40      1     4      1    -9       6   

   total  
A         
d     25  
g      8  
k     30  
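Note that the explicit assignment matters: reassigning the loop variable inside a for loop only rebinds that variable and leaves df10 and df20 pointing at the original objects. Here is a small sketch of the difference, again using a hypothetical stand-in add_total instead of func and made-up data.

```python
import pandas as pd

def add_total(frame):
    # Hypothetical stand-in for func: returns a NEW frame with a row-wise total.
    out = frame.copy()
    out["total"] = frame.sum(axis=1)
    return out

# Small made-up inputs, used here only for illustration.
df10 = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
df20 = pd.DataFrame({"x": [10, 20], "y": [30, 40]})

# This rebinds only the loop variable; df10 and df20 are untouched.
for frame in [df10, df20]:
    frame = add_total(frame)

changed_by_loop = "total" in df10.columns
print(changed_by_loop)

# Assigning back to the original names replaces what they refer to.
df10, df20 = add_total(df10), add_total(df20)

changed_by_assignment = "total" in df10.columns
print(changed_by_assignment)
```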

Alternatively, assign the results to new variables instead and leave df10 and df20 unchanged:

dfa = func(df10)
dfb = func(df20)
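With more than a couple of DataFrames, the new variables can be filled in one statement by unpacking. This is a rough sketch with a hypothetical stand-in add_total in place of func and made-up data; the number of names on the left must match the number of inputs.

```python
import pandas as pd

def add_total(frame):
    # Hypothetical stand-in for func: returns a NEW frame with a row-wise total.
    out = frame.copy()
    out["total"] = frame.sum(axis=1)
    return out

# Small made-up inputs, used here only for illustration.
df10 = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
df20 = pd.DataFrame({"x": [10, 20], "y": [30, 40]})

# Unpack the processed frames into new names; the originals are kept as they were.
dfa, dfb = (add_total(frame) for frame in (df10, df20))

print(list(dfa.columns))
print(list(df10.columns))
```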

Upvotes: 1
