Reputation: 644
My df:
id date dummy
0 A 2019Q1 1
1 A 2019Q2 0
2 A 2019Q3 0
3 B 2019Q1 1
4 B 2019Q2 1
5 B 2019Q3 0
How can I group by id and then set the earliest dummy value in each group to NaN?
Expected output:
id date dummy
0 A 2019Q1 NaN
1 A 2019Q2 0
2 A 2019Q3 0
3 B 2019Q1 NaN
4 B 2019Q2 1
5 B 2019Q3 0
Upvotes: 1
Views: 44
Reputation: 139
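import numpy as np
import pandas as pd
# Index label of the first row in each id group (assumes rows are sorted by date within each group)
indices = df.reset_index().groupby("id")["index"].first().to_list()
df.loc[indices, "dummy"] = np.nan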
Upvotes: 0
Reputation: 120409
Use a boolean mask (assuming the rows are already sorted by date within each group):
df.loc[~df['id'].duplicated(), 'dummy'] = np.nan
print(df)
# Output
id date dummy
0 A 2019Q1 NaN
1 A 2019Q2 0.0
2 A 2019Q3 0.0
3 B 2019Q1 NaN
4 B 2019Q2 1.0
5 B 2019Q3 0.0
Or:
df.loc[df.groupby('id').cumcount().eq(0), 'dummy'] = np.nan
print(df)
# Output
id date dummy
0 A 2019Q1 NaN
1 A 2019Q2 0.0
2 A 2019Q3 0.0
3 B 2019Q1 NaN
4 B 2019Q2 1.0
5 B 2019Q3 0.0
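If the rows are not guaranteed to be sorted, the same idea can be sketched by comparing each date to its group's minimum. This is only a sketch under assumptions: the sample frame below is a shuffled version of the one in the question, and it relies on the dates being strings like "2019Q1", which happen to sort chronologically as plain text. The names earliest and is_earliest are made up for the example.

```python
import pandas as pd

# Shuffled version of the question's data (hypothetical ordering for illustration)
df = pd.DataFrame({
    "id": ["A", "A", "A", "B", "B", "B"],
    "date": ["2019Q2", "2019Q1", "2019Q3", "2019Q3", "2019Q1", "2019Q2"],
    "dummy": [0, 1, 0, 0, 1, 1],
})

# Earliest date per id, broadcast back to every row of that group.
# Assumes "YYYYQn" strings, whose text order matches chronological order.
earliest = df.groupby("id")["date"].transform("min")

# True on the row(s) holding the earliest date of each group
is_earliest = df["date"].eq(earliest)

# mask() returns a new column with NaN where the condition is True,
# so the integer column is converted to float rather than modified in place.
df["dummy"] = df["dummy"].mask(is_earliest)
print(df)
```

Here rows 1 and 4 (the 2019Q1 rows) end up as NaN regardless of where they sit in the frame. If an id could have two rows with the same earliest date, both would be set to NaN.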
Upvotes: 2