pang2016

Reputation: 539

pandas: use apply to fill a new column based on another column

I have a dataframe like this:

import pandas as pd
import numpy as np
df = pd.DataFrame({'c1': [1, 2, 4, 5],
                   'c2': [3, 'P', 'N', 'T'],
                   'c3': np.nan})

The df:

   c1 c2  c3
0   1  3 NaN
1   2  P NaN
2   4  N NaN
3   5  T NaN

I want to set the c3 values based on the values in the c2 column. The result I want:

   c1 c2   c3
0   1  3  NaN
1   2  P  1.0
2   4  N  3.0
3   5  T  5.0

I currently use concat to get this result:

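df1 = df[df.c2 == 'P'].copy()  # .copy() avoids SettingWithCopyWarning
df1['c3'] = 1
df2 = df[df.c2 == 'N'].copy()
df2['c3'] = 3
df3 = df[df.c2 == 'T'].copy()
df3['c3'] = 5
df4 = df[(df.c2 != 'N') & (df.c2 != 'P') & (df.c2 != 'T')]
# pd (not pandas) is the imported name; sort_index() restores the original row order
new_df = pd.concat([df1, df2, df3, df4]).sort_index()
new_df[['c1', 'c2', 'c3']]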

I want to use the apply function to get the same result, but my attempt always overwrites the whole c3 column:

def new_col(x, df):
    if x == 'P':
        df['c3'] = 1
    elif x == 'N':
        df['c3'] = 3
    elif x == 'T':
        df['c3'] = 5
    else:
        df['c3'] = np.nan

df.c2.apply(new_col, df=df)
df

How should I change the new_col function?

Upvotes: 1

Views: 296

Answers (1)

jezrael

Reputation: 862641

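Series.apply calls the function once per element, so the function should return the value for that one element instead of assigning to df['c3'], which sets the whole column on every call. Then assign the result of apply back to c3: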

def new_col(x):
    a = np.nan
    if x == 'P':
        a = 1
    elif x == 'N':
        a = 3
    elif x == 'T':
        a = 5
    return a

df['c3'] = df.c2.apply(new_col)
print(df)
   c1 c2   c3
0   1  3  NaN
1   2  P  1.0
2   4  N  3.0
3   5  T  5.0
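
Since new_col is a plain lookup from c2 values to numbers, passing a dict to Series.map may be a shorter way to write the same thing. This is only a sketch, assuming the same df as in the question; the name c3_by_c2 is made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 4, 5],
                   'c2': [3, 'P', 'N', 'T'],
                   'c3': np.nan})

# Hypothetical lookup table (the name is illustrative): c2 value -> c3 value
c3_by_c2 = {'P': 1, 'N': 3, 'T': 5}

# Values of c2 that are not keys of the dict (here the 3) become NaN
df['c3'] = df['c2'].map(c3_by_c2)
print(df)
```

Adding another category then only means adding one more dict entry rather than another elif branch.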

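Another solution is to assign through boolean masks with loc. Column names are case-sensitive, so use 'c3'; writing 'C3' would create a new column instead of filling the existing one:

df.loc[df.c2 == 'P', 'c3'] = 1
df.loc[df.c2 == 'N', 'c3'] = 3
df.loc[df.c2 == 'T', 'c3'] = 5
print(df)
   c1 c2   c3
0   1  3  NaN
1   2  P  1.0
2   4  N  3.0
3   5  T  5.0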
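
The mask-based assignments can probably also be collapsed into a single step with numpy.select, which takes the first matching condition for each row. A minimal sketch, assuming the same df as in the question; the names conditions and choices are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 4, 5],
                   'c2': [3, 'P', 'N', 'T'],
                   'c3': np.nan})

# One boolean mask per case, paired position by position with its value
conditions = [df['c2'] == 'P', df['c2'] == 'N', df['c2'] == 'T']
choices = [1, 3, 5]

# Rows matching none of the conditions get the default (NaN)
df['c3'] = np.select(conditions, choices, default=np.nan)
print(df)
```

This writes to the existing lowercase c3 column, so no extra column is created.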

Upvotes: 1
