How to create a new pandas dataframe from old dataframe using a list of column names

Question

I have a pandas dataframe with several columns. Most of the column names follow a pattern, so I built a list of them like this:

ycols = ['{}_{}d pred'.format(ticker, i) for i in range(hm_days)]

Now I want to make a new pandas dataframe containing only these columns, with the same index as the parent dataframe. How can I do this?

Chuck · Accepted Answer

OK, so you want to create a new dataframe with new column names and the existing index of the original dataframe.

For some dataframe:

import pandas as pd

old_df = pd.DataFrame({'x': [0, 1, 2, 3], 'y': [10, 9, 8, 7]})
>>>
   x   y
0  0  10
1  1   9
2  2   8
3  3   7

columns = list(old_df)
>>>
['x', 'y']

You can specify your new columns by doing:

y_cols = ['x_pred','y_pred']
>>> ['x_pred','y_pred']

Here, y_cols is the list of your new column names. In your code, you would replace this step with ycols = ['{}_{}d pred'.format(ticker, i) for i in range(hm_days)].
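To make that substitution concrete, here is a minimal sketch of the question's list comprehension. The values of ticker and hm_days are assumptions chosen for illustration; the question does not state them.

```python
# Hypothetical values; the question does not say what ticker and hm_days are.
ticker = 'AAPL'
hm_days = 3

# Same comprehension as in the question: one column name per day offset.
ycols = ['{}_{}d pred'.format(ticker, i) for i in range(hm_days)]
print(ycols)  # ['AAPL_0d pred', 'AAPL_1d pred', 'AAPL_2d pred']
```

Note that range(hm_days) starts at 0, so the first generated name has a 0d offset.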

To get the new columns, add them to the old dataframe with a placeholder value (0 here, since your data looks numeric). Each new column automatically gets the same index as the old dataframe:

# Iterate over all columns names in y_cols
for i in y_cols:
    old_df[i]=0
>>> old_df:
   x   y  x_pred  y_pred
0  0  10       0       0
1  1   9       0       0
2  2   8       0       0
3  3   7       0       0
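The loop above changes old_df in place. If the original dataframe should stay untouched, one possible alternative is DataFrame.assign, which returns a new dataframe. This is a sketch of a different technique, not the approach this answer uses; with_placeholders is a name made up for illustration.

```python
import pandas as pd

old_df = pd.DataFrame({'x': [0, 1, 2, 3], 'y': [10, 9, 8, 7]})
y_cols = ['x_pred', 'y_pred']

# assign returns a new dataframe with the extra columns;
# old_df itself keeps only its original two columns.
# with_placeholders is a hypothetical name used for this sketch.
with_placeholders = old_df.assign(**{col: 0 for col in y_cols})

print(with_placeholders)
print(list(old_df.columns))
```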

Finally, slice your dataframe to get your new dataframe with new column names, maintaining the index of the old dataframe.

df_new = old_df[y_cols]
>>>
   x_pred  y_pred
0       0       0
1       0       0
2       0       0
3       0       0
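If the placeholder columns are not needed in the old dataframe at all, the new dataframe could also be built directly from the old index. The following is a minimal sketch of that alternative, assuming the same old_df and y_cols as above; it is not the method this answer walks through.

```python
import pandas as pd

old_df = pd.DataFrame({'x': [0, 1, 2, 3], 'y': [10, 9, 8, 7]})
y_cols = ['x_pred', 'y_pred']

# Fill every cell with the scalar 0, reusing the index of old_df
# and taking the column names from y_cols. old_df is not modified.
df_new = pd.DataFrame(0, index=old_df.index, columns=y_cols)

print(df_new)
```

If the columns in the list already exist in the parent dataframe and hold real data, plain selection with old_df[y_cols] is enough; appending .copy() makes the result independent of the parent.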

This works even if you have a named index:

>>> old_df:
      x   y  x_pred  y_pred
Date                       
0     0  10       0       0
1     1   9       0       0
2     2   8       0       0
3     3   7       0       0

df_new = old_df[y_cols]
>>>
      x_pred  y_pred
Date                
0          0       0
1          0       0
2          0       0
3          0       0
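The named-index case can be reproduced end to end with the sketch below. Setting the name through old_df.index.name is an assumption about how the 'Date' index was created; the answer only shows the resulting output.

```python
import pandas as pd

old_df = pd.DataFrame({'x': [0, 1, 2, 3], 'y': [10, 9, 8, 7]})

# Assumed step: give the index a name so it matches the output above.
old_df.index.name = 'Date'

y_cols = ['x_pred', 'y_pred']

# Add the placeholder columns, then select only those columns.
for col in y_cols:
    old_df[col] = 0

df_new = old_df[y_cols]
print(df_new)
print(df_new.index.name)  # Date
```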
