Loop through several values to fill NaNs in Pandas Dataframe

Question

I know that I cannot fill NaNs with a list, as stated in the documentation for fillna. What, then, is the preferred way to use a list of values to fill NaNs? The desired behaviour is to go through the list and fill the NaNs one at a time; if there are more NaNs than list items, start over from the beginning of the list. Example:

import numpy as np
import pandas as pd

np.random.seed(0)
s = pd.Series(np.random.randint(0, 100, 50))
s.loc[s > 25] = np.nan
s.fillna([10, 20, 30])  # Produces TypeError

Desired output:

etc.

Is this not built-in because it's difficult to vectorise? For what it's worth, this is purely theoretical; I don't have actual data.

jpp · Accepted Answer

There's no need to convert values to NaN first. So let's assume this starting point:

import numpy as np
import pandas as pd

np.random.seed(0)
s = pd.Series(np.random.randint(0, 100, 50))

Then you can use loc with np.resize:

mask = s > 25
s.loc[mask] = np.resize([10, 20, 30], mask.sum())

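Alternatively, with pd.Series.mask. Note that this variant lines the replacement values up by position in the series, so the cycle also advances over elements that are kept, and the output can differ from the loc version:

s = s.mask(s > 25, np.resize([10, 20, 30], len(s.index)))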
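A small sketch may make the difference between the loc and mask variants concrete. The toy series and the names fill, by_count and by_position are made up for illustration and are not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: only the value 5 is kept, everything else is replaced.
s = pd.Series([50, 60, 5, 70, 80, 90])
fill = [10, 20, 30]
mask = s > 25

# loc + np.resize: the cycle advances only over the replaced elements.
by_count = s.copy()
by_count.loc[mask] = np.resize(fill, int(mask.sum()))

# Series.mask + np.resize: the cycle advances with every row, replaced or not.
by_position = s.mask(mask, np.resize(fill, len(s)))

print(by_count.tolist())     # [10, 20, 5, 30, 10, 20]
print(by_position.tolist())  # [10, 20, 5, 10, 20, 30]
```

If the goal is "use the next list item for the next replaced value", the loc form is the one that matches; the mask form is closer to "use the list item that corresponds to this row's position".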

Result (from the loc version; the dtype may show as int64 depending on platform and NumPy version):

print(s.head(10))
# 0    10
# 1    20
# 2    30
# 3    10
# 4    20
# 5     9
# 6    30
# 7    21
# 8    10
# 9    20
# dtype: int32
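If the series already contains NaNs, as in the question, the same idea should carry over by building the mask with isna(). The following is a minimal sketch under that assumption; fill_cyclic is a hypothetical helper name, not a pandas function, and the sample data is invented:

```python
import numpy as np
import pandas as pd

def fill_cyclic(s, values):
    """Return a copy of s with NaNs filled by cycling through values.

    Hypothetical helper for illustration; assumes values is non-empty.
    """
    out = s.copy()
    mask = out.isna()
    # np.resize repeats values as often as needed to reach the NaN count.
    out.loc[mask] = np.resize(values, int(mask.sum()))
    return out

s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, np.nan, 7.0])
filled = fill_cyclic(s, [10, 20, 30])
print(filled.tolist())  # [1.0, 10.0, 20.0, 4.0, 30.0, 10.0, 7.0]
```

The helper works on a copy, so the original series keeps its NaNs. Because the series held NaNs, it has a float dtype and the filled values come out as floats.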
