s900n
s900n

Reputation: 3375

Python Pandas: Label data with condition on rows

I have a dataframe such as below:

 ID   Label
 1      1
 2     NaN
 3      3
 4     NaN
 5      1
 6     NaN
 7     NaN
 8      3

What I want to do is at the label column if the row is between label 1 and 3, I want to label it as 2. Example output:

 ID   Label
 1      1
 2      2
 3      3
 4     NaN
 5      1
 6      2
 7      2 
 8      3

What is the best way to do this. Thank you=)

Upvotes: 1

Views: 1468

Answers (3)

Andy L.
Andy L.

Reputation: 25239

I think subtracting bfill from ffill will give rows where value is NaN and monotonic-increase from 1 to 3 which have value -2:

df.Label.ffill().sub(df.Label.bfill())

Out[692]:
0    0.0
1   -2.0
2    0.0
3    2.0
4    0.0
5   -2.0
6   -2.0
7   -2.0
8    0.0
Name: Label, dtype: float64

So, doing eq(-2) on it to create mask and assign 2 to loc

m = df.Label.ffill().sub(df.Label.bfill()).eq(-2)
df.loc[m, 'Label'] = 2


Out[712]:
   ID  Label
0   1    1.0
1   2    2.0
2   3    3.0
3   4    NaN
4   5    1.0
5   6    2.0
6   7    2.0
7   8    2.0
8   9    3.0

Note: I added one more row to your sample to test cases of 3 consecutive NaNs

Upvotes: 0

BENY
BENY

Reputation: 323226

In your case

s=(df.Label.ffill()+df.Label.bfill())
s2=(df.Label.ffill().astype(str)+df.Label.bfill().astype(str))
df.loc[df.Label.isnull()&s2.eq('1.03.0'),'Label']=s/2
df
Out[528]: 
   ID  Label
0   1    1.0
1   2    2.0
2   3    3.0
3   4    NaN
4   5    1.0
5   6    2.0
6   7    2.0
7   8    3.0

Upvotes: 0

Erfan
Erfan

Reputation: 42886

First we make three masks (read: we mark rows with True and False)

  1. All the rows which are NaN with isna
  2. Rows which are 1 after we use ffill (forwardfill).
  3. Rows which are 3 after we use bfill to catch edgecase (see comment @busybear)

Then we conditionally fill in the value 2 if all 3 conditions are True with np.where:

m1 = df['Label'].isna()
m2 = df['Label'].ffill().eq(1)
m3 = df['Label'].bfill().eq(3)

df['Label'] = np.where(m1 & m2 & m3, 2, df['Label'])

   ID  Label
0   1    1.0
1   2    2.0
2   3    3.0
3   4    NaN
4   5    1.0
5   6    2.0
6   7    2.0
7   8    3.0

Upvotes: 1

Related Questions