DLB

Reputation: 101

Create new columns for certain columns pandas

I am creating small dataframes from a larger dataframe. From the larger one I grab the columns whose names contain a certain string, let's say 'aa'. In the smaller df I then want to create a new column for each of those: for each 'aa' column I want to add a '_goal' suffix, so that aa2 and aa7 yield aa2_goal and aa7_goal for scoring. It has to be generic, since this will apply to many smaller dfs with many different column names, but they all contain a certain string.

df before--

name  area  aa2  ab1  aa7  ac3  time  type
CAN   11    0.5  1.2  0.4  2.1  7:21  H
SPA   22    0.4  1.4  0.5  2.5  6:45  M
USP   21    0.7  1.1  0.6  2.5  3:14  G
COM   13    0.1  1.9  0.2  2.2  8:22  D
MAP   16    0.3  1.8  0.1  2.4  3:11  S

df after

name  area  aa2  ab1  aa7  ac3  time  type  aa2_new  aa7_new
CAN   11    0.5  1.2  0.4  2.1  7:21  H
SPA   22    0.4  1.4  0.5  2.5  6:45  M
USP   21    0.7  1.1  0.6  2.5  3:14  G
COM   13    0.1  1.9  0.2  2.2  8:22  D
MAP   16    0.3  1.8  0.1  2.4  3:11  S

--my attempt

for col in df:
    if 'aa' in df.columns:
        df[col+'_new']
print df

--then the next step will be to import a value into these _goal columns from a different df as well --thanks

Upvotes: 3

Views: 70

Answers (2)

jpp

Reputation: 164623

You can avoid an explicit for loop by filtering for the necessary columns (here, those whose names start with 'aa') and then using pd.DataFrame.join to join an empty dataframe that carries the new column names:

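import pandas as pd

# Names of the columns starting with 'aa', each suffixed with '_new'
new_cols = df.columns[df.columns.str.startswith('aa')] + '_new'
# Joining an empty dataframe with those names adds them, filled with NaN
df = df.join(pd.DataFrame(columns=new_cols))

print(df)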

  name  area  aa2  ab1  aa7  ac3  time type aa2_new aa7_new
0  CAN    11  0.5  1.2  0.4  2.1  7:21    H     NaN     NaN
1  SPA    22  0.4  1.4  0.5  2.5  6:45    M     NaN     NaN
2  USP    21  0.7  1.1  0.6  2.5  3:14    G     NaN     NaN
3  COM    13  0.1  1.9  0.2  2.2  8:22    D     NaN     NaN
4  MAP    16  0.3  1.8  0.1  2.4  3:11    S     NaN     NaN

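The problem with your code is that you never assign a value to df[col+'_new'], and assignment is what tells pandas to create a new column. In addition, 'aa' in df.columns tests whether a column named exactly 'aa' exists, not whether the current column name contains 'aa'; for that you need 'aa' in col.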
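Since the question asks for columns that contain the string rather than start with it, here is a minimal sketch of that variant. It assumes a cut-down copy of the example data, uses str.contains with regex=False for a plain substring match, and fills the new columns with NaN; the variable name matching is only illustrative.

```python
import pandas as pd

# Cut-down version of the example frame from the question
df = pd.DataFrame({
    "name": ["CAN", "SPA"],
    "area": [11, 22],
    "aa2": [0.5, 0.4],
    "ab1": [1.2, 1.4],
    "aa7": [0.4, 0.5],
})

# Columns whose name contains 'aa' anywhere (plain substring, not a regex)
matching = df.columns[df.columns.str.contains("aa", regex=False)]

# Assigning a value is what actually creates each new column
for col in matching:
    df[col + "_new"] = float("nan")

print(df.columns.tolist())
```

The loop runs over matching, which is computed before any column is added, so the newly created columns are not themselves picked up.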

Your subsequent question should be asked separately, if it hasn't already been answered elsewhere.

Upvotes: 2

Ben.T

Reputation: 29635

To create columns depending on whether a column name contains a substring like 'aa', you can do:

for col in df.columns: # iterate over columns' names
    if 'aa' in col:
        df[col+'_goal'] = None # fill the column with None
        # or df[col+'_goal'] = '' if you want empty string in the column you create

What you call the next step is too broad to answer here, but you can do something like df['aa2_goal'] = another_df['another_col'], which aligns the assigned values on the index.
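As a rough sketch of how that assignment could be combined with the loop, assume a second dataframe (here called goals, a made-up name with made-up values) that shares the same index and has one column per 'aa' column. Whether your other df looks like this is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["CAN", "SPA", "USP"],
    "aa2": [0.5, 0.4, 0.7],
    "aa7": [0.4, 0.5, 0.6],
})

# Hypothetical frame holding the goal values, same index as df
goals = pd.DataFrame({
    "aa2": [0.6, 0.5, 0.8],
    "aa7": [0.5, 0.6, 0.7],
})

# Collect the matching names first so the new columns are not revisited
aa_cols = [col for col in df.columns if "aa" in col]

for col in aa_cols:
    # Values are matched row by row on the index of the two frames
    df[col + "_goal"] = goals[col]

print(df)
```

If the two dataframes do not share an index, rows without a matching label end up as NaN, so a merge on a key column may be a better fit in that case.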

Upvotes: 0
