Bumbaro2217
Bumbaro2217

Reputation: 35

Replace the first value of each group

I have a df sorted by EmpID and Date. Now I need to update the Reason column of the first occurrence for each EmpID with the value 100.

Original:

EmpID Date Reason
123 01/01/2022 0
123 01/02/2022 500
123 01/03/2022 0
124 01/01/2022 0
124 01/02/2022 800
124 01/03/2022 0

Result:

EmpID Date Reason
123 01/01/2022 100
123 01/02/2022 500
123 01/03/2022 0
124 01/01/2022 100
124 01/02/2022 800
124 01/03/2022 0

I'm not getting close in any attempts. Would it be easier to update the min date of each EmpID and set Reason to 100?

I have tried the following but received an error.

DF.loc[DF.groupby(['EmpID','Date'])['Reason'].head(1), 'Reason'] = '100'

This raises:

KeyError: "None of [Float64Index([ 0.0, 800.0, 0.0, 800.0, 0.0, 800.0, 0.0, 800.0, 0.0,\n 800.0,\n ...\n 800.0, 0.0, 100.0, 800.0, 0.0, 100.0, 800.0, 0.0, 100.0,\n 800.0],\n dtype='float64', name='EEID', length=12769)] are in the [index]"

Thank you for any help!

Upvotes: 3

Views: 795

Answers (3)

Noman
Noman

Reputation: 163

for solving this problem we need to identify the row number if we get the row number we can filter based on 1st row and change the reason to 100. we can get row number using this function cumcount

data['row_num'] =data.groupby(['EmpID']).cumcount()
data.loc[data.row_num == 0,'Reason'] = 100
data

enter image description here

Upvotes: 0

BENY
BENY

Reputation: 323396

Check duplicated

df.loc[~df['EmpID'].duplicated(),'Reason'] = 100

Upvotes: 3

Vladimir Fokow
Vladimir Fokow

Reputation: 3883

You can use .transform. Just define a function that returns a mask: with all the values False except for the first one: True.

With the combined mask of all the groups you can then select and replace the elements that you need:

def f(s):
    mask = np.full(len(s), False)
    mask[0] = True
    return mask

mask = df.groupby('EmpID')['Reason'].transform(f)
df.loc[mask, 'Reason'] = 100

Result:

   EmpID       Date  Reason
0    123 2022-01-01     100
1    123 2022-01-02     500
2    123 2022-01-03       0
3    124 2022-01-01     100
4    124 2022-01-02     800
5    124 2022-01-03       0

Upvotes: 0

Related Questions