Kyle
Kyle

Reputation: 387

Python Pandas Dynamically Create a Dataframe

The code below will generate the desired output in ONE dataframe, however, I would like to dynamically create data frames in a FOR loop then assign the shifted value to that data frame. Example, data frame df_lag_12 would only contain column1_t12 and column2_12. Any ideas would be greatly appreciated. I attempted to dynamically create 12 dataframes using the EXEC statement, google searching seems to state this is poor practice.

import pandas as pd
list1=list(range(0,20))
list2=list(range(19,-1,-1))
d={'column1':list(range(0,20)),
   'column2':list(range(19,-1,-1))}
df=pd.DataFrame(d)
df_lags=pd.DataFrame()
for col in df.columns:
    for i in range(12,0,-1):
        df_lags[col+'_t'+str(i)]=df[col].shift(i)
    df_lags[col]=df[col].values  
print(df_lags)
for df in (range(12,0,-1)):
    exec('model_data_lag_'+str(df)+'=pd.DataFrame()')

Desired output for dymanically created dataframe DF_LAGS_12:

var_list=['column1_t12','column2_t12']
df_lags_12=df_lags[var_list]  
print(df_lags_12)

Upvotes: 6

Views: 27048

Answers (2)

velusamy subbarayalu
velusamy subbarayalu

Reputation: 21

for i in range(1, 16):
    text=f"Version{i}=pd.DataFrame()"
    exec(text)

A combination of exec and f"..." will help you do that. If you need iterating or Versions of same variable above statement will help

Upvotes: 2

jezrael
jezrael

Reputation: 862511

I think the best is create dictionary of DataFrames:

d = {}
for i in range(12,0,-1):
    d['t' + str(i)] = df.shift(i).add_suffix('_t' + str(i))

If need specify columns first:

d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
    d['t' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))

dict comprehension solution:

d = {'t' + str(i): df.shift(i).add_suffix('_t' + str(i)) for i in range(12,0,-1)}

print (d['t10'])
    column1_t10  column2_t10
0           NaN          NaN
1           NaN          NaN
2           NaN          NaN
3           NaN          NaN
4           NaN          NaN
5           NaN          NaN
6           NaN          NaN
7           NaN          NaN
8           NaN          NaN
9           NaN          NaN
10          0.0         19.0
11          1.0         18.0
12          2.0         17.0
13          3.0         16.0
14          4.0         15.0
15          5.0         14.0
16          6.0         13.0
17          7.0         12.0
18          8.0         11.0
19          9.0         10.0

EDIT: Is it possible by globals, but much better is dictionary:

d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
    globals()['df' + str(i)] =  df[cols].shift(i).add_suffix('_t' + str(i))

print (df10)
    column1_t10  column2_t10
0           NaN          NaN
1           NaN          NaN
2           NaN          NaN
3           NaN          NaN
4           NaN          NaN
5           NaN          NaN
6           NaN          NaN
7           NaN          NaN
8           NaN          NaN
9           NaN          NaN
10          0.0         19.0
11          1.0         18.0
12          2.0         17.0
13          3.0         16.0
14          4.0         15.0
15          5.0         14.0
16          6.0         13.0
17          7.0         12.0
18          8.0         11.0
19          9.0         10.0

Upvotes: 13

Related Questions