Josh Friedlander

Reputation: 11657

Loop through several values to fill NaNs in Pandas Dataframe

I know that I cannot fill NaNs with a list, as stated in the documentation for fillna. What, then, is the preferred way to use a list of values to fill NaNs? The desired behaviour is to walk through the list and fill the NaNs one at a time; if there are more NaNs than list items, start over from the beginning of the list. Example:

import numpy as np
import pandas as pd

np.random.seed(0)
s = pd.Series(np.random.randint(0, 100, 50))
s.loc[s > 25] = np.nan  # values above 25 become NaN
s.fillna([10, 20, 30])  # Produces TypeError

Desired output:

0    10.0
1    20.0
2    30.0
3    10.0
4    20.0
5     9.0
6    30.0
7    21.0
8    10.0

etc.

Is this not built in because it's difficult to vectorise? For what it's worth, this is purely theoretical; I don't have actual data.

Upvotes: 1

Views: 241

Answers (2)

jpp

Reputation: 164753

There's no need to convert values to NaN first. So let's assume this starting point:

import numpy as np
import pandas as pd
np.random.seed(0)
s = pd.Series(np.random.randint(0, 100, 50))

Then you can use loc with np.resize:

mask = s > 25
s.loc[mask] = np.resize([10, 20, 30], mask.sum())
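As a minimal sketch of why np.resize fits here: it repeats its input as many times as needed to reach the requested length, which is the "start over" behaviour the question asks for. The variable names below are only illustrative.

```python
import numpy as np

values = [10, 20, 30]

# np.resize repeats the input as often as needed to reach the requested length.
longer = np.resize(values, 7)
print(longer.tolist())  # [10, 20, 30, 10, 20, 30, 10]

# Asking for fewer elements than the input holds simply truncates it.
shorter = np.resize(values, 2)
print(shorter.tolist())  # [10, 20]
```

Note that this is the function np.resize, not the ndarray.resize method, which pads with zeros instead of repeating.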

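Alternatively, with pd.Series.mask. Note that this variant cycles the list by row position rather than by the number of replaced values, so rows that keep their value still use up a slot and the output can differ from the loc version (here index 6 becomes 10 instead of 30):

s = s.mask(s > 25, np.resize([10, 20, 30], len(s.index)))

Result of the loc version: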

print(s.head(10))
# 0    10
# 1    20
# 2    30
# 3    10
# 4    20
# 5     9
# 6    30
# 7    21
# 8    10
# 9    20
# dtype: int32
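The loc and mask variants are not interchangeable in general, and a small comparison may make the difference easier to see. This is only a sketch: a short hand-written series stands in for the random data, and the names base, by_count and by_position are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hand-written stand-in for the random series (an assumption for this sketch).
base = pd.Series([44, 47, 64, 67, 67, 9, 83, 21, 36, 87])
mask = base > 25
fill = [10, 20, 30]

# Variant 1: one fill value per masked row, cycling through the list.
by_count = base.copy()
by_count.loc[mask] = np.resize(fill, mask.sum())

# Variant 2: a fill value for every row position; unmasked rows still
# consume a slot in the cycle even though their own value is kept.
by_position = base.mask(mask, np.resize(fill, len(base)))

print(by_count.tolist())     # [10, 20, 30, 10, 20, 9, 30, 21, 10, 20]
print(by_position.tolist())  # [10, 20, 30, 10, 20, 9, 10, 21, 30, 10]
```

If the goal is to hand out list items strictly in order to the replaced rows, the count-based version is the one that matches the desired output in the question.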

Upvotes: 1

BENY

Reputation: 323316

Using list repetition, with floor division and modulo to build exactly one value per NaN:

s.loc[s.isna()] = [10, 20, 30] * (s.isna().sum() // 3) + [10, 20, 30][:s.isna().sum() % 3]
s
Out[271]: 
0     10.0
1     20.0
2     30.0
3     10.0
4     20.0
5      9.0
6     30.0
...
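The same idea can probably be written without the floor-division and modulo arithmetic by letting itertools do the cycling. The following is a sketch under that assumption, using a short hand-written series rather than the random data; the names fill and na are illustrative.

```python
from itertools import cycle, islice

import numpy as np
import pandas as pd

# Hand-written stand-in series with NaNs to fill (an assumption for this sketch).
s = pd.Series([np.nan, np.nan, np.nan, np.nan, np.nan, 9.0, np.nan, 21.0, np.nan])
fill = [10, 20, 30]

na = s.isna()
# cycle() repeats the list endlessly; islice() takes exactly one value per NaN.
s.loc[na] = list(islice(cycle(fill), int(na.sum())))

print(s.tolist())
# [10.0, 20.0, 30.0, 10.0, 20.0, 9.0, 30.0, 21.0, 10.0]
```

This avoids hard-coding the list length (the 3 in the expression above), so it keeps working if the list of fill values changes size.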

Upvotes: 2
