Reputation: 1698
I'm trying to apply a function to every row in a pandas DataFrame. The number of columns is variable, but the function also needs each row's index value:
def pretend(np_array, index):
    return sum(np_array)*index
df = pd.DataFrame(np.arange(16).reshape(8,2))
answer = df.apply(pretend, axis=1, args=(df.index))
I shaped it to 8x2 but I'd like it to work on any shape I pass it.
Upvotes: 1
Views: 6045
Reputation: 394061
With axis=1, each row is passed to the function as a Series, and its index value can be accessed via the .name attribute:
In [3]:
df = pd.DataFrame(data = np.random.randn(5,3), columns=list('abc'))
df
Out[3]:
          a         b         c
0 -1.662047  0.794483  0.672300
1 -0.812412 -0.325160 -0.026990
2 -0.334991  0.412977 -2.016004
3 -1.337757 -1.328030 -1.005114
4  0.699106 -1.527408 -1.288385
In [8]:
def pretend(np_array):
    return (np_array.sum())*np_array.name
df.apply(lambda x: pretend(x), axis=1)
Out[8]:
0    -0.000000
1    -1.164561
2    -3.876037
3   -11.012701
4    -8.466748
dtype: float64
You can see that the first row becomes 0 because its index value is 0 (it prints as -0.000000 because the row sum is negative).
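To tie this back to the question, here is a minimal sketch of the same approach on the 8x2 frame and on a frame of a different shape. The helper name apply_pretend and the second frame df_wide are made up for illustration, and the vectorized line at the end assumes a numeric index:

```python
import numpy as np
import pandas as pd


def pretend(row):
    # With axis=1, `row` is a Series whose .name is that row's index label.
    return row.sum() * row.name


def apply_pretend(df):
    # Hypothetical helper: row-wise apply, independent of the column count.
    return df.apply(pretend, axis=1)


# The frame from the question (8 rows, 2 columns).
df_question = pd.DataFrame(np.arange(16).reshape(8, 2))
# An illustrative frame with a different shape (3 rows, 4 columns).
df_wide = pd.DataFrame(np.arange(12).reshape(3, 4))

result_question = apply_pretend(df_question)
result_wide = apply_pretend(df_wide)

# Vectorized equivalent for this particular function, assuming a numeric
# index: row sums multiplied elementwise by the index labels.
vectorized = df_question.sum(axis=1) * df_question.index.to_numpy()
```

Because the function reads the index from row.name, nothing needs to be passed through args, so the same call works whatever the number of columns. For a simple sum-times-index calculation like this one, the vectorized form avoids calling a Python function per row.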
Upvotes: 4