Ilya Dachkovsky

Reputation: 405

Applying function to a DataFrame using its columns as parameters

I need to use a function to calculate a new column for a table from the existing data in four of its columns.

Suppose I have a function that calculates orders, impressions, or clicks - anything from different sources. Something like this:

def claculate_new_columns(complete_orders_a, total_a, completed_orders_b, total_b):
    total = 0.0

    # just some random calculations below - not important

    source_a = complete_orders_a + 1
    test_a = total_a  + 1
    source_b = completed_orders_b + 1
    test_b = total_b + 1

    for i in something(smth):
        total += source_a*test_a*source_b*test_b
    return total 

How do I use it with data from DataFrame columns?

I want to iterate over the rows and insert the results into a new column. Something like this (it doesn't work, obviously):

old_df['new_column'] = old_df.apply(claculate_new_columns(column1,column2,column3,column4))

What is the correct syntax for applying such a function to a DataFrame and passing the DataFrame's columns as the function's arguments?

Solutions from StackOverflow don't work for me, probably because I searched for the wrong thing.

Upvotes: 1

Views: 52

Answers (2)

abu8na9

Reputation: 103

To do calculations between columns and create a new column inside a function, use apply with axis=1, which passes each row to the function as a Series:

For example:

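import pandas as pd

df = pd.DataFrame({'column_1': [1, 2, 3, 4, 5],
                   'column_2': [10, 20, 30, 40, 50]})

def func(row):
    # With axis=1, `row` is a single row (a Series), not the whole DataFrame.
    # All calculations go here.
    row['new_column'] = row['column_1'] + row['column_2']
    return row

# apply returns a new DataFrame; df itself is not modified
df.apply(func, axis=1)

Result: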

    column_1    column_2    new_column
 0      1         10            11
 1      2         20            22
 2      3         30            33
 3      4         40            44
 4      5         50            55
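If only a single new column is needed, a slightly simpler variant is to have the function return a scalar and assign the output of apply to the new column. The following is a minimal sketch of that idea, not the only way to do it; the function name row_sum is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'column_1': [1, 2, 3, 4, 5],
                   'column_2': [10, 20, 30, 40, 50]})

def row_sum(row):
    # `row` is a Series holding one row; index it by column name
    return row['column_1'] + row['column_2']

# apply with axis=1 returns one value per row, which becomes the new column
df['new_column'] = df.apply(row_sum, axis=1)
```

Unlike the version above, this modifies df in place by adding new_column to it.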

Upvotes: 1

RJ Adriaansen

Reputation: 9619

Use a lambda function with axis=1, so that each row is passed to it and its columns can be forwarded as arguments:

old_df['new_column'] = old_df.apply(lambda row: claculate_new_columns(row['column1'], row['column2'], row['column3'], row['column4']), axis=1)
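For a self-contained illustration, here is a sketch with made-up data. The function calculate_new_column is a hypothetical stand-in for the one in the question: its loop over something(smth) is replaced by a fixed number of repetitions (the repeats parameter is an assumption), and the column values are invented.

```python
import pandas as pd

# Hypothetical stand-in for the question's function; `repeats` replaces
# the unspecified loop over something(smth).
def calculate_new_column(completed_orders_a, total_a,
                         completed_orders_b, total_b, repeats=3):
    total = 0.0
    source_a = completed_orders_a + 1
    test_a = total_a + 1
    source_b = completed_orders_b + 1
    test_b = total_b + 1
    for _ in range(repeats):
        total += source_a * test_a * source_b * test_b
    return total

# Invented sample data using the column names from the question
old_df = pd.DataFrame({'column1': [1, 2], 'column2': [10, 20],
                       'column3': [3, 4], 'column4': [30, 40]})

# Row-wise: the lambda receives each row and forwards four of its values
old_df['new_column'] = old_df.apply(
    lambda row: calculate_new_column(row['column1'], row['column2'],
                                     row['column3'], row['column4']),
    axis=1)

# Column-wise: this particular function only uses arithmetic, so whole
# columns can be passed directly and pandas computes all rows at once
old_df['new_column_vectorized'] = calculate_new_column(
    old_df['column1'], old_df['column2'],
    old_df['column3'], old_df['column4'])
```

The column-wise call works only because the stand-in function uses nothing but element-wise arithmetic. If the real function branches on its inputs or calls code that expects scalars, the row-wise apply is the safer choice, though it is typically slower on large tables.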

Upvotes: 2
