Reputation: 469
I'm trying to do something very simple. I have a DataFrame made of 1s and 0s, and I want to divide each value in a row by the sum of that row, so that the values become weights and each row sums to 1.
trading_signal sample:
            btc  eth
2021/08/25    1    0
2021/08/26    1    1
2021/08/27    0    0
position expected output:
            btc  eth
2021/08/25  1    0
2021/08/26  0.5  0.5
2021/08/27  0    0
I imagined it would be just:
positions = trading_signals / trading_signals.sum(axis=1)
But the positions DataFrame is just populated with NaNs.
Upvotes: 0
Views: 288
Reputation: 1658
Several options, each followed by its measured running time:
df.apply(lambda row: row / row.sum(axis=0), axis=1).fillna(0)
1.18 ms ± 5.08 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
(df.T / df.T.sum()).T.fillna(0)
1.54 ms ± 843 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
df.div(df.sum(axis=1), axis=0).fillna(0)
576 µs ± 354 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Option 3 (df.div) has the lowest runtime, probably because it makes the most of pandas' vectorized operations.
All three options give the same result.
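To check that on the sample from the question, here is a minimal sketch. The DataFrame below is rebuilt by hand from the question's sample (the answer's own df is not shown, so this is an assumption), and the names opt1, opt2 and opt3 are only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about what df holds).
df = pd.DataFrame(
    {"btc": [1, 1, 0], "eth": [0, 1, 0]},
    index=["2021/08/25", "2021/08/26", "2021/08/27"],
)

# Option 1: row-wise apply, dividing each row by its own sum.
opt1 = df.apply(lambda row: row / row.sum(), axis=1).fillna(0)

# Option 2: transpose so that plain "/" lines up with the row sums, then transpose back.
opt2 = (df.T / df.T.sum()).T.fillna(0)

# Option 3: div with an explicit axis.
opt3 = df.div(df.sum(axis=1), axis=0).fillna(0)

print(opt3)
print(opt1.equals(opt3), opt2.equals(opt3))
```

The all-zero row would be 0/0, which is why each option ends with fillna(0).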
Upvotes: 1
Reputation: 261850
You need to divide on axis=0, which is not the default with /. Use div instead:
df.div(df.sum(axis=1), axis=0)
NB. Division by 0 will give you NaNs, so add .fillna(0) to fill them with 0.
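As a rough illustration of why plain / produces only NaNs, the sketch below rebuilds the question's sample by hand (the variable names row_sums and broken are made up for this example). With /, pandas matches the index of the row-sum Series (the dates) against the DataFrame's columns (btc, eth); no labels match, so every cell comes out as NaN.

```python
import pandas as pd

# Sample data rebuilt from the question.
trading_signals = pd.DataFrame(
    {"btc": [1, 1, 0], "eth": [0, 1, 0]},
    index=["2021/08/25", "2021/08/26", "2021/08/27"],
)

# One sum per row, indexed by date.
row_sums = trading_signals.sum(axis=1)

# "/" aligns the Series index (dates) with the DataFrame columns (btc, eth).
# Nothing matches, so the result contains only NaN.
broken = trading_signals / row_sums
print(broken.isna().all().all())

# div(..., axis=0) aligns the Series with the DataFrame index instead.
# The last row is 0/0, hence the fillna(0).
positions = trading_signals.div(row_sums, axis=0).fillna(0)
print(positions)
```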
Upvotes: 1