Reputation: 377
With a data frame:
df = pd.DataFrame({"ID": [1, 1, 1,1,2,2, 2, 2,3,3,3,3] ,"year": [2004, 2005, 2006, 2007, 2004, 2005, 2006, 2007,2004, 2005, 2006, 2007], "dummy": [0,1,1,0,1,1,0,0,0,0,0,0]})
ID year dummy
1 2004 0
1 2005 1
1 2006 1
1 2007 0
2 2004 1
2 2005 1
2 2006 0
2 2007 0
3 2004 0
3 2005 0
3 2006 0
3 2007 0
I want to generate a new column: 'dummy1' from 'dummy' column
ID year dummy dummy1
1 2004 0 0
1 2005 1 1
1 2006 1 0
1 2007 0 0
2 2004 1 1
2 2005 1 0
2 2006 0 0
2 2007 0 0
3 2004 0 0
3 2005 0 0
3 2006 0 0
3 2007 0 0
If I use:
df['dummy1'] = df["dummy"].replace({"1": "0"}, inplace=False)
It replaces all the one to zero. How can I modify the code to replace only the second 1 to 0, for each ID?
Upvotes: 1
Views: 104
Reputation: 120401
Without groupby
, using drop_duplicates
df['dummy1'] = 0
df.loc[df[df['dummy'].eq(1)].drop_duplicates('ID').index, 'dummy1'] = 1
Output:
>>> df
ID year dummy dummy1
0 1 2004 0 0
1 1 2005 1 1
2 1 2006 1 0
3 1 2007 0 0
4 2 2004 1 1
5 2 2005 1 0
6 2 2006 0 0
7 2 2007 0 0
8 3 2004 0 0
9 3 2005 0 0
10 3 2006 0 0
11 3 2007 0 0
Upvotes: 1
Reputation: 377
df['dum'] = df.groupby('ID')['dummy'].cummax()
Then,
conditions = [
(df['dum']== 0),
(df['dum']== 1)
]
values = [0, (df.index == df.groupby('ID').dummy.transform('idxmax')).astype(int)]
df['dummy1'] = np.select(conditions, values)
ID year dummy dummy1
1 2004 0 0
1 2005 1 1
1 2006 1 0
1 2007 0 0
2 2004 1 1
2 2005 1 0
2 2006 0 0
2 2007 0 0
3 2004 0 0
3 2005 0 0
3 2006 0 0
3 2007 0 0
Upvotes: 0
Reputation: 323226
Try with idxmax
with transform
df['new'] = (df.index == df.groupby('ID').dummy.transform('idxmax')).astype(int)
df
ID year dummy new
0 1 2004 0 0
1 1 2005 1 1
2 1 2006 1 0
3 1 2007 0 0
4 2 2004 1 1
5 2 2005 1 0
6 2 2006 0 0
7 2 2007 0 0
Upvotes: 1
Reputation: 18406
Combine dummy value equals 1 and dummy value shifted by -1 equals 1 by and, then convert it integer type, finally assign it to the new column:
>>> df['dummy1'] = ((df['dummy'].eq(1)) & (df['dummy'].shift(-1).eq(1))).astype(int)
ID year dummy dummy1
0 1 2004 0 0
1 1 2005 1 1
2 1 2006 1 0
3 1 2007 0 0
4 2 2004 1 1
5 2 2005 1 0
6 2 2006 0 0
7 2 2007 0 0
Upvotes: 1