Pandas - DateTime within X amount of minutes from row

Question

I am not entirely positive the best way to ask or phrase this question so I will highlight my problem, dataset, my thoughts on the method and end goal and hopefully it will be clear by the end.

My problem:

My company dispatches workers and will load up dispatches to a single employee even if they are on their current dispatch. This is due to limitation in the software we use. If an employee receives two dispatches within 30 minutes, we call this a double dispatch.

We are analyzing our dispatching efficiency and I am running into a bit of a head scratcher. I need to run through our 100k row database and add an additional column that will read as a dummy variable 1 for double 0 for normal. BUT as we have multiple people we dispatch and B our records do not start ordered by dispatch, I need to determine how often a dispatch occurs to the same person within 30 minutes.

Dataset:

The dataset is incredibly massive due to poor organization in our data warehouse but for terms of what items I need these are the columns I will need for my calc.

Tech Name         |  Dispatch Time (PST)   
John Smith        |  1/1/2017 12:34
Jane Smith        |  1/1/2017 12:46
John Smith        |  1/1/2017 18:32
John Smith        |  1/1/2017 18:50

My Thoughts: How I would do it is clunky and it could work one way but not backwards. I would more or less write my code as:

import pandas as pd

df = pd.read_excel('data.xlsx')
df.sort('Dispatch Time (PST)', inplace = True)

tech_name = None
dispatch_time = pd.to_datetime('1/1/1900 00:00:00')

for index, row in df.iterrows():
    if tech_name is None:
        tech_name = row['Tech Name']
    else:
        if dispatch_time.pd.time_delta('0 Days 00:30:00') > row['Tech Dispatch Time (PST)'] AND row['Tech Name'] = tech_name:
            row['Double Dispatch'] = 1
            dispatch_time = row['Tech Dispatch Time (PST)']
        else:
            dispatch_time = row['Tech Dispatch Time (PST)']
            tech_name = row['Tech Name']

This has many problems from being slow, only tracking dates going backwards and not forwards so I will be missing many dispatches.

End Goal:

My goal is to have a dataset I can then plug back into Tableau for my report by adding on one column that reads as that dummy variable so I can filter and calculate on that.

I appreciate your time and help and let me know if any more details are necessary.

Thank you!

------------------ EDIT ------------- Added a edit to make the question clear as I failed to do so earlier.

Question: Is Pandas the best tool to use to iterate over my dataframe to see each for each datetime dispatch, is there a record that matches the Tech's Name AND is less then 30 minutes away from this record.

If so, how could I improve my algorithm or theory, if not what would the best tool be.

Desired Output - An additional column that records if a dispatch happened within a 30 minute window as a dummy variable 1 for True 0 for False. I need to see when double dispatches are occuring and how many records are true double dispatches, and not just a count that says there were 100 instances of double dispatch, but that involved over 200 records. I need to be able to sort and see each record.

Pandas - DateTime within X amount of minutes from row

Answers (1)

Related Questions