Reputation: 6324
I have two datetime
columns. One column has all the valid data, and the other column has missing date fields, hence there are some NaT
. The problem at hand is that I need to take the difference between 2 columns, but whenever there is NaT
value in the second column, the resultant column should have very large value say 100000. Here is the example:
Input data:
signup_date bgc_date
2016-01-02 NaT
2016-01-21 NaT
2016-01-11 2016-01-11
2016-01-29 2016-01-30
Expected Output
`signup_date bgc_date difference
2016-01-02 NaT 100000
2016-01-21 NaT 100000
2016-01-11 2016-01-11 0
2016-01-29 2016-01-30 1`
What I tried:
for i in range(len(df2)):
if pd.notnull(df2[i]['bgc_date']):
df2[i]['diff'] = df2[i]['bgc_date']- df2[i]['signup_date']
else:
df2[i]['diff']=10000
However, this is resulting into error. Can someone pls help.
Upvotes: 1
Views: 2139
Reputation: 214927
You can subtract the two columns and then fill missing values with a time delta:
(df.bgc_date - df.signup_date).fillna(pd.Timedelta(days=10000)).dt.days
#0 10000
#1 10000
#2 0
#3 1
#dtype: int64
Upvotes: 1