Reputation: 579
I want to replace the null values in the year column with the most recent non-null year. Given the following dataframe:
year value
2000 1
NaN 2
NaN 3
NaN 4
NaN 5
NaN 6
2001 1
NaN 2
NaN 3
NaN 4
NaN 5
NaN 6
...
2020 1
NaN 2
NaN 3
NaN 4
NaN 5
NaN 6
Each null value between two years should take the preceding year: the nulls between 2000 and 2001 become 2000, those after 2001 become 2001, and so on. The result should look like this:
year value
2000 1
2000 2
2000 3
2000 4
2000 5
2000 6
2001 1
2001 2
2001 3
2001 4
2001 5
2001 6
...
2020 1
2020 2
2020 3
2020 4
2020 5
2020 6
I tried to do this:
size = df["year"].size
val = df.iloc[0, 0]
for i in range(size):
    if df.iloc[i, 0] == None:
        df.iloc[i, 0] = val
    else:
        val = df.iloc[i, 0]
But the dataframe remains unchanged. Apparently the condition df.iloc[i, 0] == None never evaluates to true. How can I check whether a column element is null?
Upvotes: 0
Views: 334
Reputation: 34066
Use ffill() on the year column to forward-fill each missing value with the last valid one:
In [1067]: df.year = df.year.ffill()
In [1068]: df
Out[1068]:
year value
0 2000.0 1
1 2000.0 2
2 2000.0 3
3 2000.0 4
4 2000.0 5
5 2000.0 6
6 2001.0 1
7 2001.0 2
8 2001.0 3
9 2001.0 4
10 2001.0 5
11 2001.0 6
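As for why the == None check never fires: pandas stores a missing number as NaN, which is a float rather than None, and NaN does not even compare equal to itself, so pd.isna() is the usual test. Below is a minimal sketch of that, assuming a small hypothetical sample shaped like the question's data (the six-row frame and the variable names are made up for illustration). It also casts the filled column back to integers, since the NaN values are what forced the column to float.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the layout in the question
df = pd.DataFrame({
    "year": [2000, np.nan, np.nan, 2001, np.nan, np.nan],
    "value": [1, 2, 3, 1, 2, 3],
})

first_missing = df.iloc[1, 0]

# NaN is a float, not None, so this comparison is False
is_none = bool(first_missing == None)

# NaN is not equal to itself either, so equality tests cannot detect it
is_self_equal = bool(first_missing == first_missing)

# pd.isna() is the reliable way to test a single element for missingness
is_missing = bool(pd.isna(first_missing))

# Forward-fill the column, then cast back to int now that no NaN remains
df["year"] = df["year"].ffill().astype(int)
```

The astype(int) step only works once every NaN has been filled; if the first row of the column were missing, ffill() would leave it as NaN and the cast would raise an error.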
Upvotes: 1