Optimize a loop that fills NaN values

Question

I have a part of my code that I want to optimize:

nan_rows = df.loc[df.Open.isna()].index
for i in nan_rows:
    df.Open.iloc[i] = df.Close.iloc[i-1]

It replaces each NaN in the Open column with the Close value from the previous row. This code is slow, and I often have to apply it to larger dataframes. Is there any way to optimize it? Thank you.

Anshul · Accepted Answer

IIUC, this should work, even when 'Open' has several consecutive NaN values:

import pandas as pd

# sample dataset for read_clipboard()
'''
Close    Open
1.0   1.0
2.0   NaN
3.0   3.0
4.0   NaN
5.0   NaN
6.0   NaN
7.0   7.0
8.0   NaN
'''

df = pd.read_clipboard()
# print(df)

df input:

   Close  Open
0    1.0   1.0
1    2.0   NaN
2    3.0   3.0
3    4.0   NaN
4    5.0   NaN
5    6.0   NaN
6    7.0   7.0
7    8.0   NaN


df['Open'] = df['Open'].fillna(df['Close'].shift(1))
# print(df)

df output:

   Close  Open
0    1.0   1.0
1    2.0   1.0
2    3.0   3.0
3    4.0   3.0
4    5.0   4.0
5    6.0   5.0
6    7.0   7.0
7    8.0   7.0
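
For anyone who wants to check this without the clipboard, here is a minimal self-contained sketch. It builds the same sample frame inline and compares the vectorized fill with a row-by-row version of the loop from the question. The names looped and vectorized are only illustrative, and the loop is rewritten with positional df.iloc[row, col] assignment (an assumption on my part, to avoid chained assignment), so treat it as a sanity check rather than a benchmark.

```python
import numpy as np
import pandas as pd

# Same sample data as above, built inline instead of via read_clipboard()
df = pd.DataFrame({
    "Close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "Open": [1.0, np.nan, 3.0, np.nan, np.nan, np.nan, 7.0, np.nan],
})

# Reference: the row-by-row logic from the question, applied to a copy
# using positional indexing for both the read and the write
looped = df.copy()
open_col = looped.columns.get_loc("Open")
close_col = looped.columns.get_loc("Close")
for i in np.flatnonzero(looped["Open"].isna().to_numpy()):
    looped.iloc[i, open_col] = looped.iloc[i - 1, close_col]

# Vectorized: fill each missing Open with the previous row's Close
vectorized = df.copy()
vectorized["Open"] = vectorized["Open"].fillna(vectorized["Close"].shift(1))

# Compare the two results on the sample data
print(vectorized["Open"].equals(looped["Open"]))
```

One difference to keep in mind: if the very first row has a NaN in Open, the loop reads Close.iloc[-1], which is the last row, while shift(1) has no previous row to offer and leaves that NaN in place.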
