Skipper Lin
Skipper Lin

Reputation: 179

Pandas calculate the time difference in seconds excluding business days

I have a dataset that contains two columns. One is the Message Received Timestamp and another is the Message Sent Timestamp. The format of the time look like such 2021-06-05T07:00:34Z. I would like to calculate the time difference in seconds between those two columns and make sure the difference is less than 24 hours by excluding business days.

I am aware of the np.busday_count method but I would like to calculate the difference in seconds. Please see my current code below which fails to consider the business days: (pd.to_datetime(df["Message Received Timestamp"]) - pd.to_datetime(df["Message Sent Timestamp"])) < pd.Timedelta("1 day")

I would really appreciate your help to factor in the business days calculation (in seconds). Thank you so much.

Upvotes: 1

Views: 484

Answers (1)

SeaBean
SeaBean

Reputation: 23227

You can use Timedelta.total_seconds() to get the total time difference in seconds between the 2 dates as follows:

df["Diff_Total_Seconds"] = (pd.to_datetime(df["Message Received Timestamp"]) - pd.to_datetime(df["Message Sent Timestamp"])).dt.total_seconds() 

Then calculate the total seconds for business days by multiplying np.busday_count with 86400 (seconds in one day):

df["Busday_Total_Seconds"] = (np.busday_count(pd.to_datetime(df["Message Sent Timestamp"]).values.astype('datetime64[D]'), pd.to_datetime(df["Message Received Timestamp"]).values.astype('datetime64[D]')) - 1) * 86400

Finally, get the time difference in seconds excluding business days as their difference.

df["Diff_Total_Seconds_Excl_Busday"] = df["Diff_Total_Seconds"] - df["Busday_Total_Seconds"] 

Test Result:

  Message Received Timestamp Message Sent Timestamp  Diff_Total_Seconds  Busday_Total_Seconds  Diff_Total_Seconds_Excl_Busday
0       2021-06-07T06:41:01Z   2021-06-04T07:00:34Z            258027.0                     0                        258027.0
1       2021-06-09T06:41:01Z   2021-06-08T07:00:34Z             85227.0                     0                         85227.0

Edit

After further discussion with OP, we come up with updated codes with handling of some special cases.

df["Message Received Timestamp"] = pd.to_datetime(df["Message Received Timestamp"])
df["Message Sent Timestamp"] = pd.to_datetime(df["Message Sent Timestamp"])
df["Diff_Total_Sec"] = (df["Message Received Timestamp"] - df["Message Sent Timestamp"]).dt.total_seconds()
df["Diff_Days"] = (df["Message Received Timestamp"].dt.normalize() - df["Message Sent Timestamp"].dt.normalize()).dt.days
df["NonBDay_Total_Sec"] = (df["Diff_Days"] - np.busday_count(df["Message Sent Timestamp"].values.astype('datetime64[D]'), df["Message Received Timestamp"].values.astype('datetime64[D]'))) * 86400
df["NonBDay_Total_Sec"] = np.where((df["Message Received Timestamp"].dt.dayofweek == 6) & (df["NonBDay_Total_Sec"] < 86400 * 2), 86400 * 2, df["NonBDay_Total_Sec"])
df["NonBDay_Total_Sec"] = np.where((df["Message Received Timestamp"].dt.dayofweek == 5) & (df["NonBDay_Total_Sec"] < 86400), 86400, df["NonBDay_Total_Sec"])
df["Diff_Total_Sec_Excl_NonBDay"] = df["Diff_Total_Sec"] - df["NonBDay_Total_Sec"] 
df["Validity_Result"] = (df["Diff_Total_Sec_Excl_NonBDay"] < 86400) & (df["Message Sent Timestamp"] <= df["Message Received Timestamp"])

Test Result:

   Message Received Timestamp    Message Sent Timestamp  Diff_Total_Sec  Diff_Days  NonBDay_Total_Sec  Diff_Total_Sec_Excl_NonBDay  Validity_Result
0   2021-06-07 06:41:01+00:00 2021-06-04 07:00:34+00:00        258027.0          3             172800                      85227.0             True
1   2021-06-07 06:41:01+00:00 2021-06-07 05:41:01+00:00          3600.0          0                  0                       3600.0             True
2   2021-06-07 06:41:01+00:00 2021-06-07 07:41:01+00:00         -3600.0          0                  0                      -3600.0            False
3   2021-06-07 06:41:01+00:00 2021-06-03 06:41:01+00:00        345600.0          4             172800                     172800.0            False
4   2021-06-07 06:41:01+00:00 2021-06-10 06:41:01+00:00       -259200.0         -3                  0                    -259200.0            False
5   2021-06-07 06:41:01+00:00 2021-06-03 07:00:01+00:00        344460.0          4             172800                     171660.0            False
6   2021-06-07 06:41:01+00:00 2021-06-04 07:00:01+00:00        258060.0          3             172800                      85260.0             True
7   2021-06-07 06:41:01+00:00 2021-06-04 05:00:01+00:00        265260.0          3             172800                      92460.0            False
8   2021-06-06 22:01:03+00:00 2021-06-04 09:38:38+00:00        217345.0          2             172800                      44545.0             True
9   2021-06-05 07:01:36+00:00 2021-06-04 03:49:42+00:00         97914.0          1              86400                      11514.0             True
10  2021-06-07 01:10:09+00:00 2021-06-03 03:32:06+00:00        337083.0          4             172800                     164283.0            False
11  2021-06-06 09:03:34+00:00 2021-06-04 08:36:36+00:00        174418.0          2             172800                       1618.0             True

Upvotes: 1

Related Questions