Reputation: 541
I have a input dataset as follows:
date event A
2015-07-15 A 1
2015-07-16 A 1
.......
2015-07-22 A 1
2015-07-23 A 1
........
2015-07-26 A 1
Here A is the dummy/indicator variable for event A. I want to further subset the dummy variable, where your A_new(new indicator variable) is 1 if date is in the range 2015-07-15 and 2015-07-22, else it remains 0.
Output is:
date event A A_new
2015-07-15 A 1 1
2015-07-16 A 1 1
.......
2015-07-22 A 1 1
2015-07-23 A 1 0
........
2015-07-26 A 1 0
My code looks like:
df['A_new']=df['A']
df.loc[~((df.date >='2015-07-15') & (df.date <='2015-07-22')),'A_new']=0
But still I am not get desired result. Just wonder my logic is correct.
Upvotes: 2
Views: 1527
Reputation: 863411
Simplier solution is cast boolean mask to int
- so True
are 1
and False
are 0
:
df['A_new'] = ((df.date >='2015-07-15') & (df.date <='2015-07-22')).astype(int)
print (df)
date event A A_new
0 2015-07-15 A 1 1
1 2015-07-16 A 1 1
2 2015-07-22 A 1 1
3 2015-07-23 A 1 0
4 2015-07-26 A 1 0
Another solution with between
:
df['A_new'] = df.date.between('2015-07-15','2015-07-22').astype(int)
print (df)
date event A A_new
0 2015-07-15 A 1 1
1 2015-07-16 A 1 1
2 2015-07-22 A 1 1
3 2015-07-23 A 1 0
4 2015-07-26 A 1 0
As piRSquared pointed if need values by column A
multiple by A
(thank you):
print (df)
date event A
0 2015-07-15 A 1
1 2015-07-16 A 6
2 2015-07-22 A 2
3 2015-07-23 A 1
4 2015-07-26 A 1
df['A_new'] = df.date.between('2015-07-15','2015-07-22').astype(int).mul(df.A)
print (df)
date event A A_new
0 2015-07-15 A 1 1
1 2015-07-16 A 6 6
2 2015-07-22 A 2 2
3 2015-07-23 A 1 0
4 2015-07-26 A 1 0
Solution with where
:
df['A_new'] = df.A.where(df.date.between('2015-07-15','2015-07-22'), 0)
print (df)
date event A A_new
0 2015-07-15 A 1 1
1 2015-07-16 A 6 6
2 2015-07-22 A 2 2
3 2015-07-23 A 1 0
4 2015-07-26 A 1 0
Upvotes: 3