Reputation: 3375
I have a dataframe such as below:
ID Label
1 1
2 NaN
3 3
4 NaN
5 1
6 NaN
7 NaN
8 3
What I want to do is at the label column if the row is between label 1 and 3, I want to label it as 2. Example output:
ID Label
1 1
2 2
3 3
4 NaN
5 1
6 2
7 2
8 3
What is the best way to do this. Thank you=)
Upvotes: 1
Views: 1468
Reputation: 25239
I think subtracting bfill
from ffill
will give rows where value is NaN
and monotonic-increase from 1 to 3 which have value -2:
df.Label.ffill().sub(df.Label.bfill())
Out[692]:
0 0.0
1 -2.0
2 0.0
3 2.0
4 0.0
5 -2.0
6 -2.0
7 -2.0
8 0.0
Name: Label, dtype: float64
So, doing eq(-2)
on it to create mask and assign 2
to loc
m = df.Label.ffill().sub(df.Label.bfill()).eq(-2)
df.loc[m, 'Label'] = 2
Out[712]:
ID Label
0 1 1.0
1 2 2.0
2 3 3.0
3 4 NaN
4 5 1.0
5 6 2.0
6 7 2.0
7 8 2.0
8 9 3.0
Note: I added one more row to your sample to test cases of 3 consecutive NaN
s
Upvotes: 0
Reputation: 323226
In your case
s=(df.Label.ffill()+df.Label.bfill())
s2=(df.Label.ffill().astype(str)+df.Label.bfill().astype(str))
df.loc[df.Label.isnull()&s2.eq('1.03.0'),'Label']=s/2
df
Out[528]:
ID Label
0 1 1.0
1 2 2.0
2 3 3.0
3 4 NaN
4 5 1.0
5 6 2.0
6 7 2.0
7 8 3.0
Upvotes: 0
Reputation: 42886
First we make three masks (read: we mark rows with True
and False
)
NaN
with isna
1
after we use ffill
(forwardfill).3
after we use bfill
to catch edgecase (see comment @busybear)Then we conditionally fill in the value 2
if all 3 conditions are True
with np.where
:
m1 = df['Label'].isna()
m2 = df['Label'].ffill().eq(1)
m3 = df['Label'].bfill().eq(3)
df['Label'] = np.where(m1 & m2 & m3, 2, df['Label'])
ID Label
0 1 1.0
1 2 2.0
2 3 3.0
3 4 NaN
4 5 1.0
5 6 2.0
6 7 2.0
7 8 3.0
Upvotes: 1