Reputation: 634
I have the following df
and would like to drop the rows (date
- d/m/y) before offer_date
and after mature_date
for each id
. The final df would look like final df
. Any help with this groupby
would be awesome!
df
date offer_date mature_date id a b c d
0 1/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
1 2/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
2 3/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
3 4/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
4 5/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
5 6/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
6 7/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
7 8/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
8 9/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
9 10/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
10 1/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
11 2/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
12 3/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
13 4/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
14 5/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
15 6/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
16 7/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
17 8/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
18 9/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
19 10/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
20 1/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
21 2/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
22 3/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
23 4/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
24 5/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
25 6/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
26 7/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
27 8/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
28 9/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
29 10/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
final df
date offer_date mature_date id a b c d
3 4/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
4 5/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
5 6/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
6 7/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
7 8/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
8 9/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
10 1/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
11 2/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
12 3/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
13 4/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
14 5/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
21 2/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
22 3/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
23 4/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
Upvotes: 0
Views: 54
Reputation: 29635
you don't need groupby
actually, you can do a global mask
as all dates can be compared row wise. Convert to_datetime
the columns to ensure the good comparisons.
# format for dates with day first
date_format = '%d/%m/%Y'
# create the mask with both inequality
# convert to dateitme to be ensure the selection
mask = (
pd.to_datetime(df['date'], format=date_format)
.ge(pd.to_datetime(df['offer_date'], format=date_format))
& pd.to_datetime(df['date'], format=date_format)
.le(pd.to_datetime(df['mature_date'], format=date_format))
)
print(df.loc[mask])
date offer_date mature_date id a b c d
3 4/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
4 5/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
5 6/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
6 7/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
7 8/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
8 9/1/2000 4/1/2000 9/1/2000 1 10 20 10.0 11
10 1/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
11 2/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
12 3/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
13 4/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
14 5/1/2000 1/1/2000 5/1/2000 2 30 30 40.0 15
21 2/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
22 3/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
23 4/1/2000 2/1/2000 4/1/2000 3 33 35 40.0 15
EDIT: as @Henry Ecker pointed out, in this case with a simple date format, one can use the parameter dayfirst
, so using this pd.to_datetime(df['date'], dayfirst=True)
for each date columns instead of defining the format would work the same.
Upvotes: 2