kalaiyarasi Uma

Reputation: 51

Pandas - How to create new columns and merge back to the existing dataframe

I have a dataframe as below:

import pandas as pd
df = pd.DataFrame([[1, 5, 'Dog'], [2, 6, 'Dog'], [3, 7, 'Cat'], [4, 8, 'Cat']], columns=['A', 'B', 'Type'])

Index  A  B  Type
0      1  5  Dog
1      2  6  Dog
2      3  7  Cat
3      4  8  Cat

Based on the value in the 'Type' column, I need to call a type-specific function (for Dog rows, call the Dog function; for Cat rows, call the Cat function) and create two new columns, C and D, from the values that function returns.

Finally, my dataframe should look like this:

Index  A  B  Type  C           D
0      1  5  Dog   Dog1 Value  Dog2 Value
1      2  6  Dog   Dog1 Value  Dog2 Value
2      3  7  Cat   Cat1 Value  Cat2 Value
3      4  8  Cat   Cat1 Value  Cat2 Value

Columns C and D hold the values returned from the functions; the values shown here are only examples.

The problem I face is this:

For each value in the 'Type' column, I filter the rows, call that type's function, and get the C and D columns. But when I merge the result back into the original dataframe with left_index=True and right_index=True, it creates _x and _y suffixed copies of all the columns, which causes problems when I iterate over the next ('Cat') rows. Please advise how I should approach this problem.

Code

def ext_fun(x1,x2,i):
    if i=='Dog':
        #Do some calc to find c and d value and return back

        return ['c','d']
    if i=='Cat':
        #do some calc to find c and d value and return back
        return ['c','d']
    
for i in df['Type'].unique():
    df1 = df[df.Type==i]
    df1[['C','D']] = df1.apply(lambda x: ext_fun(x['A'],x['B'],i),result_type='expand',axis=1)
    df = pd.merge(df,df1,left_index = True,right_index=True)

Note: I have 10 to 15 types in the 'Type' column, with hundreds of records for each type. The values for columns C and D are dynamic and have to be computed, so a function call chosen by the 'Type' value is required.
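
A minimal sketch of where the suffixed columns come from, assuming the same sample dataframe and using the placeholder values 'c' and 'd' in place of the real calculation. Because A, B and Type exist on both sides of the merge, pandas renames them with its default suffixes; the default inner join also keeps only the rows present in both frames.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 5, 'Dog'], [2, 6, 'Dog'], [3, 7, 'Cat'], [4, 8, 'Cat']],
    columns=['A', 'B', 'Type'],
)

# Copy of the Dog rows with placeholder C and D values added.
dog_rows = df[df['Type'] == 'Dog'].copy()
dog_rows['C'] = 'c'
dog_rows['D'] = 'd'

# A, B and Type are present in both frames, so merge applies the
# default suffixes ('_x', '_y') to tell the two copies apart.
merged = pd.merge(df, dog_rows, left_index=True, right_index=True)
print(list(merged.columns))
print(len(merged))
```

After this first merge there is no longer a column named Type, which would explain why the next pass of the loop runs into trouble.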

Upvotes: 1

Views: 102

Answers (1)

imburningbabe

Reputation: 792

You don't have to split and then re-merge the dataframes; you can assign to the matching rows directly with .loc:

df.loc[df['Type'] == 'Dog', 'C'] = 'Dog1 Value'
df.loc[df['Type'] == 'Cat', 'C'] = 'Cat1 Value'

df.loc[df['Type'] == 'Dog', 'D'] = 'Dog2 Value'
df.loc[df['Type'] == 'Cat', 'D'] = 'Cat2 Value'

Sorry about the values: I don't know which ones you will actually use, so I filled them in with yours.
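
If the values have to come from a function per type, the same .loc idea could be extended roughly as follows. This is only a sketch: dog_fun, cat_fun and the funcs mapping are hypothetical stand-ins, since the real calculations are not shown in the question.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 5, 'Dog'], [2, 6, 'Dog'], [3, 7, 'Cat'], [4, 8, 'Cat']],
    columns=['A', 'B', 'Type'],
)

# Hypothetical per-type functions; each returns the [C, D] pair for one row.
def dog_fun(a, b):
    return [f'Dog1 {a + b}', f'Dog2 {a * b}']

def cat_fun(a, b):
    return [f'Cat1 {a - b}', f'Cat2 {a * b}']

# Map each Type value to its function (an assumption for illustration).
funcs = {'Dog': dog_fun, 'Cat': cat_fun}

# Create C and D once so every type writes into the same two columns.
df['C'] = None
df['D'] = None

for type_name, func in funcs.items():
    mask = df['Type'] == type_name
    # Apply the type's function to its rows only; 'expand' gives two columns.
    result = df.loc[mask].apply(
        lambda row: func(row['A'], row['B']), axis=1, result_type='expand'
    )
    # Write the results back in place; no merge, so no suffixed columns.
    df.loc[mask, ['C', 'D']] = result.to_numpy()

print(df)
```

With 10 to 15 types, only the funcs mapping would need to grow; the loop itself stays the same.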

Upvotes: 1
