Johnny Tam
Johnny Tam

Reputation: 495

Pandas divide every nth row

I want to use every 5th row as a reference row (ref_row), divide this ref_row starting from this ref_row and do the same for the next 4 rows.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
    
len = df.shape[0]

for idx in range(0,len,5):

    ref_row = df.iloc[idx:idx+1,:]

    for idx_next in range(idx,idx+5):

        df.iloc[idx_next:idx_next+1,:] = df.iloc[idx_next:idx_next+1,:].div(ref_row)

However, I got all NaN except the ref_row.

    A   B   C   D
0   1.0 1.0 1.0 1.0
1   NaN NaN NaN NaN
2   NaN NaN NaN NaN
3   NaN NaN NaN NaN
4   NaN NaN NaN NaN
... ... ... ... ...
95  1.0 1.0 1.0 1.0
96  NaN NaN NaN NaN
97  NaN NaN NaN NaN
98  NaN NaN NaN NaN
99  NaN NaN NaN NaN

Any idea what's wrong?

Upvotes: 0

Views: 1302

Answers (1)

user7864386
user7864386

Reputation:

The problem with your code is that with df.iloc[idx_next:idx_next+1,:] and df.iloc[idx:idx+1,:], you're indexing df rows as DF objects. So when you divide, the indices don't match and you get NaN. Replace

df.iloc[idx_next:idx_next+1,:] 

with

df.iloc[idx_next]

and

df.iloc[idx:idx+1,:] 

with

df.iloc[idx] 

everywhere, it will work as expected (because they're now Series objects, so the indices match).

You can also repeat the array of every fifth row of the DataFrame using np.repeat on axis=0, then element-wise divide it with the resulting array:

out = df.div(np.repeat(df[::5].to_numpy(), 5, axis=0))

Output:

           A         B         C         D
0   1.000000  1.000000  1.000000  1.000000
1   0.726190  0.359375  0.967742  1.644068
2   0.130952  0.046875  0.161290  0.406780
3   0.488095  0.312500  0.919355  0.305085
4   0.857143  0.203125  0.967742  0.525424
..       ...       ...       ...       ...
95  1.000000  1.000000  1.000000  1.000000
96  0.061224  1.400000  0.518519  0.882353
97  1.510204  1.300000  1.740741  5.588235
98  0.224490  2.100000  1.407407  0.294118
99  1.061224  1.400000  1.388889  3.411765

[100 rows x 4 columns]

Upvotes: 2

Related Questions