Pepacz

Reputation: 959

Pandas - on each column apply a function returning multiple values

I have a dataframe

>>>import pandas as pd
>>>df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [10, 11, 12]})
>>>print(df)
   a  b   c
0  1  4  10
1  2  5  11
2  3  6  12

and a function that returns multiple values.

>>>def my_fun(values):
>>>    return(values+10, values*3)

It works on a single column:

>>>res_1, res_2 = my_fun(df['a'])
>>>print(res_1)

0    11
1    12
2    13

>>>print(res_2)

0    3
1    6
2    9

But when I try to use apply to get two DataFrames as the result, I get an error.

>>>res_1, res_2 = df.apply(my_fun, axis=0)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-173-142501cd22f6> in <module>()
     23 #     return(values_2+10, values_2*3)
     24 
---> 25 res_1, res_2 = df.apply(my_fun, axis=0)

ValueError: too many values to unpack (expected 2)

Any clue? Note that this is just an illustrative example.


UPDATE:

I am really aiming to apply a function columnwise rather than to find another workaround (this example is only illustrative). The example above is ambiguous; a better one would be the following, where I want to add the average of each column:

>>>import numpy as np
>>>def my_fun_2(values):
>>>    return(values+np.mean(values), values*3)

Upvotes: 2

Views: 517

Answers (2)

r.ook

Reputation: 13898

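The key issue is that your function returns a tuple, and res_1, res_2 = my_fun(df['a']) unpacks that tuple into two Series objects, res_1 and res_2. With df.apply, the function is called once per column, so the result holds one entry per column (three here) rather than two, which is why the unpacking fails with "too many values to unpack (expected 2)".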

To illustrate:

df.apply(my_fun)

# a       ([11, 12, 13], [3, 6, 9])
# b    ([14, 15, 16], [12, 15, 18])
# c    ([20, 21, 22], [30, 33, 36])
# dtype: object

df.applymap(my_fun)

#          a         b         c
# 0  (11, 3)  (14, 12)  (20, 30)
# 1  (12, 6)  (15, 15)  (21, 33)
# 2  (13, 9)  (16, 18)  (22, 36)

You can manually unpack these into two DataFrames after the apply call if you want, but as you can see it's tedious:

df1 = df.apply(my_fun, axis = 0).apply(lambda x: x[0]).transpose()
df2 = df.apply(my_fun, axis = 0).apply(lambda x: x[1]).transpose()

df1

#        a   b   c
#    0  11  14  20
#    1  12  15  21
#    2  13  16  22

df2

#       a   b   c
#    0  3  12  30
#    1  6  15  33
#    2  9  18  36

So unless you want to change your function to return a different object type or manually unpack these into two different DataFrame objects, @Wen's solution is the best and simplest for you.
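For the my_fun_2 case in the question's update, the manual unpacking above can be done without going through apply at all. This is only a minimal sketch: it calls the function once per column in a dict comprehension and then assembles the first and second tuple elements into two DataFrames. The names results, res_1 and res_2 are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [10, 11, 12]})

def my_fun_2(values):
    # Returns a tuple of two Series: (column + its mean, column * 3).
    return values + np.mean(values), values * 3

# Call the function once per column and keep the returned tuples by column name.
results = {col: my_fun_2(df[col]) for col in df.columns}

# Assemble the first and second tuple elements into separate DataFrames.
res_1 = pd.DataFrame({col: pair[0] for col, pair in results.items()})
res_2 = pd.DataFrame({col: pair[1] for col, pair in results.items()})

print(res_1)
print(res_2)
```

Each column is processed independently, so the mean added in res_1 is the mean of that column only.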

Upvotes: 1

BENY

Reputation: 323376

It seems you just need to call the function directly on the whole DataFrame, like this:

df1,df2=my_fun(df)
df1
Out[1455]: 
    a   b   c
0  11  14  20
1  12  15  21
2  13  16  22
df2
Out[1456]: 
   a   b   c
0  3  12  30
1  6  15  33
2  9  18  36

The manner in which your function returns values currently does not make it suitable for use with apply.
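The same direct call can probably cover the column-mean example from the question's update too. A minimal sketch, assuming a hypothetical variant my_fun_3 (not from the question) that uses the pandas .mean() method instead of np.mean: for a DataFrame, .mean() returns one mean per column by default, and adding that Series to the DataFrame aligns it with the column labels.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [10, 11, 12]})

# Hypothetical variant of my_fun_2: values.mean() is a scalar for a Series
# and a Series of per-column means for a DataFrame.
def my_fun_3(values):
    return values + values.mean(), values * 3

# Called on the whole DataFrame, the tuple unpacks into two DataFrames.
df1, df2 = my_fun_3(df)

print(df1)
print(df2)
```

Because the function only uses operations that work on both Series and DataFrame, it can still be called on a single column, e.g. my_fun_3(df['a']).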

Upvotes: 4
