Use a function to return multiple column outputs from specific column inputs using Pandas

Question

I would like to add two new columns to my dataframe by applying a function that takes inputs from multiple, specific pre-existing columns.

Here is my approach, which works for returning one column but not for multiple columns.

Here is my DataFrame:

import pandas as pd

d = {'a': [3,0,2,2],
    'b': [0,1,2,3],
    'c': [1,1,2,3],
    'd': [2,2,1,3]}

df = pd.DataFrame(d)

I'm trying to apply this function:

def myfunc(a,b,c):
    if a > 2 and b > 2:
        print('condition 1',a,b)
        return pd.Series((a,b))
    elif a < 2 and c < 2:
        print('condition 2',a,c)
        return pd.Series((b,c))
    else:
        print('no condition')
        return pd.Series((None,None))

Like this:

df['e'],df['f'] = df.apply(lambda x: myfunc(x['a'],x['b'],x['c']),axis=1)

Output:

no condition
no condition
condition 2 0 1
no condition
no condition

DataFrame result: columns e and f contain 0 and 1 in every row.

How can I input multiple columns and get multiple columns out?

piRSquared · Accepted Answer

The issue is with the assignment, not myfunc.

When you unpack a DataFrame as a tuple, you iterate over its column labels. That's why you get (0, 1) for everything: the DataFrame returned by apply has columns labeled 0 and 1.

df['e'], df['f'] = pd.DataFrame([[8, 9]] * 1000000, columns=['Told', 'You'])
print(df)

   a  b  c  d     e    f
0  3  0  1  2  Told  You
1  0  1  1  2  Told  You
2  2  2  2  1  Told  You
3  2  3  3  3  Told  You
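
The same behavior can be seen in isolation. This is a minimal sketch; the names frame and unnamed are made up for illustration and are not part of the question.

```python
import pandas as pd

frame = pd.DataFrame([[8, 9], [8, 9]], columns=['Told', 'You'])

# Iterating (and therefore tuple-unpacking) a DataFrame walks its column labels,
# not its rows or values.
labels = list(frame)
first, second = frame
print(labels)          # ['Told', 'You']
print(first, second)   # Told You

# Without explicit column names the labels are the integers 0 and 1,
# which is what the question's assignment ends up storing in e and f.
unnamed = pd.DataFrame([[8, 9], [8, 9]])
e, f = unnamed
print(e, f)            # 0 1
```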

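Use join. apply labels the new columns 0 and 1, so rename them to e and f first

df.join(df.apply(lambda x: myfunc(x['a'],x['b'],x['c']),axis=1).rename(columns={0: 'e', 1: 'f'}))

Or pd.concat

pd.concat([df, df.apply(lambda x: myfunc(x['a'],x['b'],x['c']),axis=1).rename(columns={0: 'e', 1: 'f'})], axis=1)

Both give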

   a  b  c  d    e    f
0  3  0  1  2  NaN  NaN
1  0  1  1  2  1.0  1.0
2  2  2  2  1  NaN  NaN
3  2  3  3  3  NaN  NaN
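
A possible variant, sketched under some assumptions: it swaps in plain tuple returns and apply's result_type='expand' option (available in pandas 0.23 and later) instead of returning pd.Series, and drops the print calls. The name new_cols is hypothetical. This is not the approach from the answer above, only an equivalent way to get named output columns.

```python
import pandas as pd

df = pd.DataFrame({'a': [3, 0, 2, 2],
                   'b': [0, 1, 2, 3],
                   'c': [1, 1, 2, 3],
                   'd': [2, 2, 1, 3]})

def myfunc(a, b, c):
    # Same branches as in the question, returning plain tuples.
    if a > 2 and b > 2:
        return a, b
    elif a < 2 and c < 2:
        return b, c
    return None, None

# result_type='expand' spreads each returned tuple across columns 0 and 1.
new_cols = df.apply(lambda x: myfunc(x['a'], x['b'], x['c']),
                    axis=1, result_type='expand')
new_cols.columns = ['e', 'f']  # label the two outputs explicitly

# join aligns on the index, so each row keeps its own pair of values.
out = df.join(new_cols)
print(out)
```

Only the row with a=0, b=1, c=1 satisfies a condition, so it is the only row with non-missing values in e and f.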
