Reputation: 568
I have date_time_df
dataframe with date_time
column as like this one:
import datetime
date_time_df = pd.DataFrame({'date_time':[datetime.datetime(2020, 7, 10, 9, 15, 3),
datetime.datetime(2020, 7, 9, 9, 50, 0),
datetime.datetime(2020, 7, 9, 10, 50, 3),
datetime.datetime(2020, 7, 9, 16, 50, 0),
datetime.datetime(2020, 7, 9, 20, 30, 0),
datetime.datetime(2020, 7, 8, 9, 50, 0),
datetime.datetime(2020, 7, 8, 10, 50, 3),
datetime.datetime(2020, 7, 8, 16, 50, 0),
datetime.datetime(2020, 7, 8, 20, 30, 0),
datetime.datetime(2020, 7, 7, 10, 50, 3),
datetime.datetime(2020, 7, 7, 16, 50, 0),
datetime.datetime(2020, 7, 6, 16, 50, 0),
datetime.datetime(2020, 7, 6, 20, 30, 0),
datetime.datetime(2020, 7, 5, 9, 50, 0),
datetime.datetime(2020, 7, 5, 20, 30, 0),
datetime.datetime(2020, 7, 4, 16, 50, 0),
datetime.datetime(2020, 7, 3, 9, 50, 0),
datetime.datetime(2020, 7, 3, 10, 50, 3),
datetime.datetime(2020, 7, 3, 16, 50, 0),
datetime.datetime(2020, 7, 2, 9, 50, 0),
datetime.datetime(2020, 7, 2, 17, 0, 0),
datetime.datetime(2020, 7, 1, 10, 45, 3),
datetime.datetime(2020, 7, 1, 17, 0, 0),
datetime.datetime(2020, 7, 1, 20, 30, 0)]})
I need to sort it by the following conditions:
time >= 10
hours);time >=10
hours);16:50
or 20:30
or 17:00
in the above example)I tried to solve it by the following steps:
date_time_df
data frame to date, time columns. So, date_time_df
has 3 columns as like ['date_time', 'date', 'time']
gr_df
as a result of:gr_df = date_time_df.groupby('date')['time'].nunique().reset_index()
gr_df.rename(columns={'time':'count_timeslots'}, inplace=True)
gr_df
and select rows which have only one timeslot:dates_with_one_timeslot = list(gr_df[gr_df['count_timeslots']==1].date_time.unique())
gr_df
and select rows which have slot around 10 hours:req_timeslots = list(date_time_df[date_time_df['time'].dt.strftime('%H')=='10'].date_time.unique())
16:50
or 20:30
or 17:00
)Correct output:
result_lst = [
datetime.datetime(2020, 7, 9, 10, 50, 3),
datetime.datetime(2020, 7, 8, 10, 50, 3),
datetime.datetime(2020, 7, 7, 10, 50, 3),
datetime.datetime(2020, 7, 6, 16, 50, 0),
datetime.datetime(2020, 7, 5, 20, 30, 0),
datetime.datetime(2020, 7, 4, 16, 50, 0),
datetime.datetime(2020, 7, 3, 10, 50, 3),
datetime.datetime(2020, 7, 2, 17, 0, 0),
datetime.datetime(2020, 7, 1, 10, 45, 3)
]
Upvotes: 2
Views: 389
Reputation: 71689
Filter the rows where hour
is greater than or equal to 10
then sort the values by date_time
in ascending order and drop duplicates by date
component
m = df['date_time'].dt.hour >= 10
df['date'] = df['date_time'].dt.date
df[m].sort_values('date_time').drop_duplicates('date')
date_time date
21 2020-07-01 10:45:03 2020-07-01
20 2020-07-02 17:00:00 2020-07-02
17 2020-07-03 10:50:03 2020-07-03
15 2020-07-04 16:50:00 2020-07-04
14 2020-07-05 20:30:00 2020-07-05
11 2020-07-06 16:50:00 2020-07-06
9 2020-07-07 10:50:03 2020-07-07
6 2020-07-08 10:50:03 2020-07-08
2 2020-07-09 10:50:03 2020-07-09
Upvotes: 2