Reputation: 111
I am continuing to work on my pandas skills and have run up against some difficult problems. This problem involves two dataframes df1 and df2. df1 contains event times and the corresponding details of each event. df2 contains a time period established by a start time and a stop time.
GOAL:
code:
import pandas as pd
df1 = {'Event': ['2020-12-01 00:10:22', '2020-12-01 00:15:11','2020-12-01 00:18:00',
'2020-12-01 00:31:00', '2020-12-01 00:54:00' , '2020-12-01 01:01:00' ,
'2020-12-01 01:19:00' , '2020-12-01 01:23:00' , '2020-12-01 01:24:00' ,
'2020-12-01 01:56:00' , '2020-12-01 21:02:00', '2020-12-01 02:41:00',
'2020-12-01 02:44:00' , '2020-12-01 03:19:00' ,'2020-12-01 03:22:00' ,
'2020-12-01 03:49:00' , '2020-12-01 05:24:00' ,'2020-12-01 05:56:00' ,
'2020-12-01 08:02:00'
] ,
'Count' : [1 , 2 , 4 , 2 , 5 ,
3 , 7 , 9 , 10 , 1 ,
2 , 5 , 6 , 10 , 5 ,
6 ,7 , 8 , 3] ,
'Code' : ['A' , 'A' , 'A' , 'A' , 'B' ,
'B' , 'B' , 'B' , 'B' , 'B' ,
'C' , 'C' , 'C' , 'C' , 'C' ,
'D' , 'D' , 'D' , 'D']
}
df1 = pd.DataFrame(df1 , columns = ['Event' , 'Count' , 'Code'])
df1['Event'] = pd.to_datetime(df1['Event'])
df
Event Count Code
0 2020-12-01 00:10:22 1 A
1 2020-12-01 00:15:11 2 A
2 2020-12-01 00:18:00 4 A
3 2020-12-01 00:31:00 2 A
4 2020-12-01 00:54:00 5 B
5 2020-12-01 01:01:00 3 B
6 2020-12-01 01:19:00 7 B
7 2020-12-01 01:23:00 9 B
8 2020-12-01 01:24:00 10 B
9 2020-12-01 01:56:00 1 B
10 2020-12-01 21:02:00 2 C
11 2020-12-01 02:41:00 5 C
12 2020-12-01 02:44:00 6 C
13 2020-12-01 03:19:00 10 C
14 2020-12-01 03:22:00 5 C
15 2020-12-01 03:49:00 6 D
16 2020-12-01 05:24:00 7 D
17 2020-12-01 05:56:00 8 D
18 2020-12-01 08:02:00 3 D
creating df2 code :
df2 = {'Start Time' : ['2020-12-01 00:00:00', '2020-12-01 00:30:00','2020-12-01 01:30:00',
'2020-12-01 02:30:00', '2020-12-01 03:30:00' , '2020-12-01 04:30:00' ,
'2020-12-01 05:30:00' , '2020-12-01 07:30:00' , '2020-12-01 10:30:00' ,
'2020-12-01 15:00:00' , '2020-12-02 21:00:00'] ,
'End Time' : ['2020-12-01 00:30:00', '2020-12-01 01:30:00','2020-12-01 02:30:00',
'2020-12-01 03:30:00', '2020-12-01 04:30:00' , '2020-12-01 05:30:00' ,
'2020-12-01 07:30:00' , '2020-12-01 10:30:00' , '2020-12-01 15:00:00' ,
'2020-12-01 21:00:00' , '2020-12-02 00:00:00']
}
df2 = pd.DataFrame(df2 , columns = ['Start Time' , 'End Time'])
df2['Start Time'] = pd.to_datetime(df2['Start Time'])
df2['End Time'] = pd.to_datetime(df2['End Time'])
df2
Start Time End Time
0 2020-12-01 00:00:00 2020-12-01 00:30:00
1 2020-12-01 00:30:00 2020-12-01 01:30:00
2 2020-12-01 01:30:00 2020-12-01 02:30:00
3 2020-12-01 02:30:00 2020-12-01 03:30:00
4 2020-12-01 03:30:00 2020-12-01 04:30:00
5 2020-12-01 04:30:00 2020-12-01 05:30:00
6 2020-12-01 05:30:00 2020-12-01 07:30:00
7 2020-12-01 07:30:00 2020-12-01 10:30:00
8 2020-12-01 10:30:00 2020-12-01 15:00:00
9 2020-12-01 15:00:00 2020-12-01 21:00:00
10 2020-12-01 21:00:00 2020-12-02 00:00:00
Strategy:
My strategy is to use pd.Dataframe.between_time and then a lambda function for the conditional statement checks but I cannot seem to make it work.
Below is the desired output:
Start Time End Time Test
0 2020-12-01 00:00:00 2020-12-01 00:30:00 True
1 2020-12-01 00:30:00 2020-12-01 01:30:00 False
2 2020-12-01 01:30:00 2020-12-01 02:30:00 True
3 2020-12-01 02:30:00 2020-12-01 03:30:00 False
4 2020-12-01 03:30:00 2020-12-01 04:30:00 True
5 2020-12-01 04:30:00 2020-12-01 05:30:00 True
6 2020-12-01 05:30:00 2020-12-01 07:30:00 True
7 2020-12-01 07:30:00 2020-12-01 10:30:00 True
8 2020-12-01 10:30:00 2020-12-01 15:00:00 False
9 2020-12-01 15:00:00 2020-12-01 21:00:00 False
10 2020-12-02 21:00:00 2020-12-02 00:00:00 False
Upvotes: 1
Views: 43
Reputation: 71687
You can use a custom function that yields a boolean value based on the specified conditions. Here we have used Series.is_monotonic_increasing
to check if Count
in increasing and Series.nunique
to check if all codes are same for the given time period:
def f():
for x, y in df2.to_numpy():
s = df1[df1['Event'].between(x, y)]
yield s['Count'].is_monotonic_increasing & s['Code'].nunique() == 1
df2['Test'] = list(f())
Result:
Start Time End Time Test
0 2020-12-01 00:00:00 2020-12-01 00:30:00 True
1 2020-12-01 00:30:00 2020-12-01 01:30:00 False
2 2020-12-01 01:30:00 2020-12-01 02:30:00 True
3 2020-12-01 02:30:00 2020-12-01 03:30:00 False
4 2020-12-01 03:30:00 2020-12-01 04:30:00 True
5 2020-12-01 04:30:00 2020-12-01 05:30:00 True
6 2020-12-01 05:30:00 2020-12-01 07:30:00 True
7 2020-12-01 07:30:00 2020-12-01 10:30:00 True
8 2020-12-01 10:30:00 2020-12-01 15:00:00 False
9 2020-12-01 15:00:00 2020-12-01 21:00:00 False
10 2020-12-02 21:00:00 2020-12-02 00:00:00 False
Upvotes: 1