Groupby based on date time column on minute level in pandas

Question

I have a data frame as shown below.

Doctor       Appointment           Booking_ID   
  A          2020-01-18 12:00:00     1 
  A          2020-01-18 12:30:00     2
  A          2020-01-18 13:00:00     3 
  A          2020-01-18 13:00:00     4 
  B          2020-01-18 12:00:00     5 
  B          2020-01-18 12:30:00     6 
  B          2020-01-18 13:00:00     7
  B          2020-01-18 13:00:00     8 
  B          2020-01-18 13:00:00     9 
  B          2020-01-18 16:30:00     10 
  A          2020-01-19 12:00:00     11 
  A          2020-01-19 12:30:00     12 
  A          2020-01-19 13:00:00     13
  A          2020-01-19 13:30:00     14
  A          2020-01-19 14:00:00     15 
  A          2020-01-19 14:00:00     16 
  A          2020-01-19 14:00:00     17 
  A          2020-01-19 14:00:00     18 
  B          2020-01-19 12:00:00     19 
  B          2020-01-19 12:30:00     20
  B          2020-01-19 13:00:00     21
  B          2020-01-19 13:30:00     22 
  B          2020-01-19 14:00:00     23
  B          2020-01-19 13:30:00     24 
  B          2020-01-19 15:00:00     25 
  B          2020-01-18 15:30:00     26

From the above I would like to find out the number of booking on same time for same doctor.

Expected Output:

    Doctor           Appointment     Booking_ID   Number_of_Booking
      A          2020-01-18 12:00:00     1         1
      A          2020-01-18 12:30:00     2         1
      A          2020-01-18 13:00:00     3         2
      A          2020-01-18 13:00:00     4         2
      B          2020-01-18 12:00:00     5         1
      B          2020-01-18 12:30:00     6         1
      B          2020-01-18 13:00:00     7         3
      B          2020-01-18 13:00:00     8         3
      B          2020-01-18 13:00:00     9         3
      B          2020-01-18 16:30:00     10        1
      A          2020-01-19 12:00:00     11        1
      A          2020-01-19 12:30:00     12        1
      A          2020-01-19 13:00:00     13        1
      A          2020-01-19 13:30:00     14        1
      A          2020-01-19 14:00:00     15        4
      A          2020-01-19 14:00:00     16        4
      A          2020-01-19 14:00:00     17        4
      A          2020-01-19 14:00:00     18        4
      B          2020-01-19 12:00:00     19        1
      B          2020-01-19 12:30:00     20        1 
      B          2020-01-19 13:00:00     21        1
      B          2020-01-19 13:30:00     22        2
      B          2020-01-19 14:00:00     23        2
      B          2020-01-19 13:30:00     24        2 
      B          2020-01-19 14:00:00     25        2
      B          2020-01-18 15:30:00     26        1

Example:

On the time 2020-01-19 13:30:00 doctor B has two bookings as shown below

Doctor       Appointment           Booking_ID
B          2020-01-19 13:30:00     22
B          2020-01-19 13:30:00     24

So the output will be as shown below

 Doctor       Appointment           Booking_ID     Number_of_Booking
    B        2020-01-19 13:30:00     22             2
    B        2020-01-19 13:30:00     24             2

jezrael · Accepted Answer

For first use GroupBy.transform with GroupBy.size:

df['Number_of_Booking']=df.groupby(['Doctor','Appointment'])['Booking_ID'].transform('size')

print (df.head())
  Doctor          Appointment  Booking_ID  Number_of_Booking
0      A  2020-01-18 12:00:00           1                  1
1      A  2020-01-18 12:30:00           2                  1
2      A  2020-01-18 13:00:00           3                  2
3      A  2020-01-18 13:00:00           4                  2
4      B  2020-01-18 12:00:00           5                  1

For second if unique combination of Doctor and Appointment in all data like in sample assign length of DataFrame:

df['Number_of_Booking'] = len(df)
print (df)
  Doctor          Appointment  Booking_ID  Number_of_Booking
0      B  2020-01-19 13:30:00          22                  2
1      B  2020-01-19 13:30:00          24                  2

Groupby based on date time column on minute level in pandas

Answers (1)

Related Questions