MichaelA

Reputation: 1986

Use transform to calculate a value from two different columns

I'd like to apply a small function that uses two parameters on one data frame using the transform function.

Consider this rather useless example function:

import pandas as pd

def example_function(x, y):
    if y=="hi":
        res = x*3
    else:
        res = x
    return res

Depending on the value of y ("hi" or something else), the value x is either multiplied by 3 or returned unaltered.

Given this example DataFrame:


df = pd.DataFrame(dict([("A",[1,2,3,4]), ("B",["hi", "ho", "ho", "hi"])]))

I'd like to get this result:

    A   B   C
0   1   hi  3
1   2   ho  2
2   3   ho  3
3   4   hi  12

I assumed that passing two columns should work:

df["combined"] = df[["A", "B"]].transform(example_function)

but I'm getting an error (missing 1 required positional argument). Any suggestions on how to solve this?

Upvotes: 1

Views: 87

Answers (1)

jezrael

Reputation: 862801

This is not possible, because transform processes each column separately, so the function only ever receives one column (a Series) at a time and cannot combine values across columns.

DataFrame.apply with axis=1 passes each row to the function, which works as you need:

df["combined"] = df.apply(lambda x: example_function(x.A, x.B), axis=1)
print(df)
   A   B  combined
0  1  hi         3
1  2  ho         2
2  3  ho         3
3  4  hi        12
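
For this particular function, a vectorized alternative may also be worth considering. The sketch below swaps in numpy.where, which is not part of the solution above, and assumes the logic stays a single condition on B; it rebuilds the example DataFrame so it can run on its own.

```python
import numpy as np
import pandas as pd

# Same example data as in the question
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["hi", "ho", "ho", "hi"]})

# Where B equals "hi", take A * 3; otherwise keep A unchanged.
df["combined"] = np.where(df["B"] == "hi", df["A"] * 3, df["A"])
print(df)
```

This avoids calling a Python function once per row, but it only fits when the row logic can be written as column-wise conditions.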

To see that transform passes each column separately, you can check with this function:

def function(x):
    print(x)
    return x

df[["A", "B"]].transform(function)
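
A variant of the same check that records what the function receives, instead of printing it, might look like the sketch below. The names seen_names and record_column are hypothetical and exist only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["hi", "ho", "ho", "hi"]})

seen_names = []  # hypothetical helper list, only for this illustration

def record_column(x):
    # x is expected to be a single column (a Series), never a pair of values;
    # store its name and return it unchanged
    seen_names.append(x.name)
    return x

result = df[["A", "B"]].transform(record_column)
print(seen_names)
```

The recorded names should be the column labels, which shows that the function is called with one whole column at a time and never with a value from A together with a value from B.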

Upvotes: 2
