Lroy_12374
Lroy_12374

Reputation: 67

How to subtract dates and accounting for null values?

I have a column that calculates the duration of seconds it takes from A to B in format hh:mm:ss. However, A and B may show null values in the data.

Let's say A=05:15:00 and B=naT, then the subtraction will return 5:15 seconds which is misleading and wrong in context due to B being infinity! How can I specify to only subtract columns B from A if both columns are NOT NULL?

This is the code I have:

df['A_to_B']=(df.B-df.A).dt.total_seconds()

Upvotes: 0

Views: 1465

Answers (2)

Quang Hoang
Quang Hoang

Reputation: 150735

You can do:

df['A_to_B'] = np.where(df['A'].notna() & df['B'].notna(), 
                        (df['A'] - df['B']).dt.total_seconds(),
                        np.nan)

Sample data

         A        B
0 05:15:00      NaT
1      NaT 00:00:15
2 05:15:00 00:15:00

Output:

         A        B   A_to_B
0 05:15:00      NaT      NaN
1      NaT 00:00:15      NaN
2 05:15:00 00:15:00  18000.0

Upvotes: 0

loohhoo
loohhoo

Reputation: 36

Python does not use null, but it does use a type called None to represent the absence of a value / type. So you would check if df.B and df.A are both not None, perhaps like this:

if (df.A is not None) and (df.B is not None):
    df['A_to_B'] = (df.B-df.A).dt.total_seconds()

Upvotes: 1

Related Questions