Olive

Reputation: 644

Groupby and convert the first value to NaN

My DataFrame:

   id  date    dummy
0  A   2019Q1    1
1  A   2019Q2    0
2  A   2019Q3    0
3  B   2019Q1    1
4  B   2019Q2    1
5  B   2019Q3    0

How can I group by id and then set the earliest value of dummy in each group to NaN?

Expected output:

   id  date    dummy
0  A   2019Q1    NaN
1  A   2019Q2    0
2  A   2019Q3    0
3  B   2019Q1    NaN
4  B   2019Q2    1
5  B   2019Q3    0

Upvotes: 1

Views: 44

Answers (2)

Alex

Reputation: 139

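import numpy as np
import pandas as pd

# Index label of the first row of each id group, in the current row order.
# Assumes the index is unnamed, so reset_index() adds a column called "index".
indices = df.reset_index().groupby("id")["index"].first().to_list()

# np.nan rather than np.NaN: the np.NaN alias was removed in NumPy 2.0.
df.loc[indices, 'dummy'] = np.nan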

Upvotes: 0

Corralien

Reputation: 120409

Use a boolean mask (assuming the rows are already sorted by date within each group):

df.loc[~df['id'].duplicated(), 'dummy'] = np.nan
print(df)

# Output
  id    date  dummy
0  A  2019Q1    NaN
1  A  2019Q2    0.0
2  A  2019Q3    0.0
3  B  2019Q1    NaN
4  B  2019Q2    1.0
5  B  2019Q3    0.0

Or:

df.loc[df.groupby('id').cumcount().eq(0), 'dummy'] = np.nan
print(df)

# Output
  id    date  dummy
0  A  2019Q1    NaN
1  A  2019Q2    0.0
2  A  2019Q3    0.0
3  B  2019Q1    NaN
4  B  2019Q2    1.0
5  B  2019Q3    0.0
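
If the rows are not already sorted within each group, one possible variant is to compare each row's date with the earliest date in its group. This is only a sketch: the shuffled sample frame below is made up for illustration, and it assumes the date strings sort chronologically as text (true for the YYYYQn format in the question).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: same shape as the question, rows deliberately shuffled.
df = pd.DataFrame({
    "id": ["A", "A", "B", "B", "A", "B"],
    "date": ["2019Q2", "2019Q1", "2019Q3", "2019Q1", "2019Q3", "2019Q2"],
    "dummy": [0, 1, 0, 1, 0, 1],
})

# Cast to float up front so the column can hold NaN without an implicit upcast.
df["dummy"] = df["dummy"].astype(float)

# Earliest date per id, broadcast back to every row of that group.
earliest = df.groupby("id")["date"].transform("min")

# True only on rows whose date equals the earliest date of their group.
mask = df["date"].eq(earliest)

df.loc[mask, "dummy"] = np.nan
print(df)
```

Note that if a group has several rows sharing its earliest date, all of them become NaN, whereas the duplicated() and cumcount() masks above only ever blank one row per group.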

Upvotes: 2
