Alexander Thomsen

Reputation: 469

Getting NaN for dividing each row value by row sum

I'm trying to do something very simple. I have a DataFrame made of 1s and 0s, and I want to divide each value in a row by the sum of that row, so that the values become weights and each row sums to 1.

trading_signals sample

            btc  eth
2021/08/25  1    0
2021/08/26  1    1
2021/08/27  0    0

positions expected output

            btc  eth
2021/08/25  1    0
2021/08/26  0.5  0.5
2021/08/27  0    0

I imagined it would just be:

positions = trading_signals / trading_signals.sum(axis=1)

But the positions DataFrame ends up populated entirely with NaNs.

Upvotes: 0

Views: 288

Answers (2)

Niv Dudovitch

Reputation: 1658

Several options + comparison of running times:

  1. apply
df.apply(lambda row: row / row.sum(axis=0), axis=1).fillna(0)

1.18 ms ± 5.08 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

  2. transpose
(df.T / df.T.sum()).T.fillna(0)

1.54 ms ± 843 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

  3. div
df.div(df.sum(axis=1), axis=0).fillna(0)

576 µs ± 354 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Option 3 (div) has the lowest runtime, probably because it makes the most of pandas' vectorized operations.

All three give the same result.
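To check that the three options agree, here is a minimal sketch. It assumes df holds the sample data from the question; the variable names via_apply, via_transpose and via_div are made up for illustration, and no timings are reproduced.

```python
import pandas as pd

# Sample data from the question (assumed layout: dates as index, assets as columns)
df = pd.DataFrame(
    {"btc": [1, 1, 0], "eth": [0, 1, 0]},
    index=["2021/08/25", "2021/08/26", "2021/08/27"],
)

# Option 1: row-wise apply
via_apply = df.apply(lambda row: row / row.sum(), axis=1).fillna(0)

# Option 2: transpose so the row sums line up with the columns, then transpose back
via_transpose = (df.T / df.T.sum()).T.fillna(0)

# Option 3: div with an explicit axis
via_div = df.div(df.sum(axis=1), axis=0).fillna(0)

# Raises AssertionError if any of the frames differ
pd.testing.assert_frame_equal(via_apply, via_div)
pd.testing.assert_frame_equal(via_transpose, via_div)

print(via_div)
```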

Upvotes: 1

mozway

Reputation: 261850

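You need to divide along axis=0, which is not the default with /. By default the Series of row sums is aligned against the column labels, nothing matches, and every value comes out as NaN. Use div instead: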

df.div(df.sum(axis=1), axis=0)

NB. A row whose sum is 0 gives 0/0, which is NaN, so add .fillna(0) to fill those with 0
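As a rough illustration using the sample data from the question (the names row_sums and broken are only for this sketch), the following shows both the all-NaN result of / and the fix with div:

```python
import pandas as pd

# Sample data from the question
trading_signals = pd.DataFrame(
    {"btc": [1, 1, 0], "eth": [0, 1, 0]},
    index=["2021/08/25", "2021/08/26", "2021/08/27"],
)

# One sum per row, indexed by date
row_sums = trading_signals.sum(axis=1)

# `/` matches the Series index (dates) against the column labels (btc, eth).
# No labels match, so the result contains only NaN.
broken = trading_signals / row_sums
print(broken.isna().all().all())

# div with axis=0 matches the Series index against the row index instead.
# The all-zero row produces 0/0 = NaN, which fillna(0) replaces with 0.
positions = trading_signals.div(row_sums, axis=0).fillna(0)
print(positions)
```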

Upvotes: 1
