Moose

Reputation: 148

Applying a function to a pandas DataFrame raises ValueError (function takes just one argument)

It seems I can apply some functions to a DataFrame without problems, but others raise a ValueError.

import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
data = np.random.randn(6, 4)

df = pd.DataFrame(data, index=dates, columns=list('ABCD'))

def my_max(y):
    return max(y, 0)

def times_ten(y):
    return 10 * y

df.apply(lambda x: times_ten(x))  # Works fine
df.apply(lambda x: my_max(x))  # Raises ValueError

The first apply works fine; the second one raises:

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index A')

I know I can generate the "max(df,0)" in other ways (e.g. by df[df<0]=0), so I'm not looking for a solution to this particular problem. Rather, I'm interested in why the apply above doesn't work.

Upvotes: 3

Views: 3052

Answers (1)

behzad.nouri

Reputation: 77941

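The built-in max cannot handle a scalar and a Series together. It compares its arguments, the comparison returns a boolean Series rather than a single True/False, and max cannot decide which argument is larger: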

>>> max(df['A'], 0)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
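
To make the failure concrete, here is a minimal sketch of the two steps involved: the comparison and the truth test. The Series s is made-up sample data, not the frame from the question.

```python
import pandas as pd

s = pd.Series([-1.0, 2.0, -3.0])  # hypothetical sample column

# Comparing a Series with a scalar is element-wise and yields a boolean Series.
mask = s < 0
print(mask.tolist())

# Asking for a single truth value from that boolean Series is what fails.
try:
    bool(mask)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# The built-in max runs into the same truth test internally.
try:
    max(s, 0)
    max_raised = False
except ValueError:
    max_raised = True
```

times_ten does not hit this because 10*y never needs a single truth value; multiplication is simply applied element by element.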

Either use np.maximum, which computes the element-wise maximum:

>>> def my_max(y):
...     return np.maximum(y, 0)
... 
>>> df.apply(lambda x:my_max(x))
                A      B      C      D
2013-01-01  0.000  0.000  0.178  0.992
2013-01-02  0.000  1.060  0.000  0.000
2013-01-03  0.528  2.408  2.679  0.000
2013-01-04  0.564  0.573  0.320  1.220
2013-01-05  0.903  0.497  0.000  0.032
2013-01-06  0.505  0.000  0.000  0.000

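Or use .applymap, which calls the function on each element separately (in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map):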

>>> def my_max(y):
...     return max(y,0)
... 
>>> df.applymap(lambda x:my_max(x))
                A      B      C      D
2013-01-01  0.000  0.000  0.178  0.992
2013-01-02  0.000  1.060  0.000  0.000
2013-01-03  0.528  2.408  2.679  0.000
2013-01-04  0.564  0.573  0.320  1.220
2013-01-05  0.903  0.497  0.000  0.032
2013-01-06  0.505  0.000  0.000  0.000
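
As a side note, the difference between the two approaches comes down to what the function receives. The sketch below uses a small made-up frame, not the data from the question. It checks that apply passes whole columns as Series, and that np.maximum can also be called on the whole DataFrame. DataFrame.clip(lower=0) is shown as one more way to get the same result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [-1.0, 2.0], "B": [3.0, -4.0]})  # hypothetical data

# DataFrame.apply hands each column to the function as a Series.
received = df.apply(lambda col: type(col).__name__)
print(received.tolist())

# np.maximum works element-wise, so it is safe inside apply ...
via_apply = df.apply(lambda col: np.maximum(col, 0))

# ... and it also accepts the whole DataFrame, so apply is optional here.
direct = np.maximum(df, 0)

# clip with a lower bound replaces negative values with 0 as well.
clipped = df.clip(lower=0)
```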

Upvotes: 4
