Reputation: 465
I've been using the pandas apply method on both Series and DataFrames, but I am obviously still missing something, because I'm stumped by a simple function I'm trying to run.
This is what I was doing:
def minmax(row):
    return (row - row.min())/(row.max() - row.min())
row.apply(minmax)
But this returns an all-zero Series. For example, if
row = pd.Series([0, 1, 2])
then
minmax(row)
returns [0.0, 0.5, 1.0], as desired. But row.apply(minmax) returns [0, 0, 0].
I believe this is because the series is of ints and the integer division returns 0. However, I don't understand,
I suspect I'm missing something fundamental in how apply works... or being dense. Either way, thanks in advance.
Upvotes: 2
Views: 6314
Reputation: 21888
When you call row.apply(minmax) on a Series, only the individual values are passed to the function, one at a time. This is called element-wise application.
Invoke function on values of Series. Can be ufunc (a NumPy function that applies to the entire Series) or a Python function that only works on single values.
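To see the element-wise behaviour for yourself, here is a minimal sketch that records what Series.apply actually hands to the function. The helper name record and the list seen are made up for illustration. With a single value there is nothing to take a min or max over: if the value arrives as a NumPy integer, min and max are the value itself, so minmax computes 0/0 for every element (and a plain Python int has no min method at all, so the exact outcome likely depends on the pandas version).

```python
import pandas as pd

row = pd.Series([0, 1, 2])

# Hypothetical helper: collect every argument that Series.apply passes in.
seen = []

def record(value):
    seen.append(value)
    return value

row.apply(record)

# Every recorded argument is a single scalar, never the Series itself.
for value in seen:
    print(type(value).__name__, value)
```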
When you call apply(minmax) on a DataFrame, either columns (the default, axis=0) or rows (axis=1) are passed to the function, according to the value of axis.
Objects passed to functions are Series objects having index either the DataFrame’s index (axis=0) or the columns (axis=1). Return type depends on whether passed function aggregates, or the reduce argument if the DataFrame is empty. This is called row-wise or column-wise.
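As a rough illustration of the axis argument, the sketch below runs the question's minmax over a small DataFrame both ways. The frame df and its column names a and b are invented for the example.

```python
import pandas as pd

def minmax(values):
    # Works here because DataFrame.apply passes a whole Series each time.
    return (values - values.min()) / (values.max() - values.min())

# Made-up data for illustration.
df = pd.DataFrame({"a": [0, 1, 2], "b": [10, 20, 40]})

# axis=0 (the default): each column is passed as a Series.
by_column = df.apply(minmax)

# axis=1: each row is passed as a Series.
by_row = df.apply(minmax, axis=1)

print(by_column)
print(by_row)
```

With axis=0 each column is scaled to the range 0 to 1 on its own; with axis=1 each row is scaled across its two columns, so here the smaller entry in every row becomes 0 and the larger becomes 1.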
This is why your example works as expected on the DataFrame and not on the Series. Check this answer for information on mapping functions to Series.
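For the Series case, one way around this (a sketch, not the only option) is to skip apply and hand the whole Series to the function, either by calling it directly or through Series.pipe, which passes the Series itself as the first argument.

```python
import pandas as pd

def minmax(values):
    # Needs the whole Series so that min() and max() span all elements.
    return (values - values.min()) / (values.max() - values.min())

row = pd.Series([0, 1, 2])

# Call the function on the Series directly.
scaled = minmax(row)

# Same thing in method-chaining style.
piped = row.pipe(minmax)

print(scaled.tolist())
```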
Upvotes: 3