Nikhil Jain
Nikhil Jain

Reputation: 79

Unable to subtract two datetime columns

I have the following code:

date_today = datetime.datetime.today()
date_today = date_today.strftime('%d-%m-%Y')

df = df[['Email','First Name', 'Last Name', 'Newsletter and Caution', 'Interest','Last Contacted','Relationship']]

df['Last Contacted'] = pd.to_datetime(df['Last Contacted'], format='%Y.%m.%d %H:%M:%S')

df['Last Contacted'] = df['Last Contacted'].dt.strftime('%d-%m-%Y')

df['Last Contacted'] = df['Last Contacted'].apply(pd.Timestamp)

df['Days since contact'] = (date_today - df['Last Contacted']).dt.days

I'm trying to add a column with the numbers of days since last contact obtained by subtracting the last contacted column from today's date.

However I get the following error:

TypeError: unsupported operand type(s) for -: 'DatetimeArray' and 'str'

The code works perfectly in Jupyter but not in Terminal.

What could be the problem?

Upvotes: 0

Views: 985

Answers (3)

rpanai
rpanai

Reputation: 13437

Consider using pandas only

import pandas as pd
df = pd.DataFrame({"LastContacted":pd.date_range(start='2019-01-01', freq="6H", periods=20)})

df["DaysSinceLast"] = (pd.datetime.today() - df["LastContacted"]).dt.days

Upvotes: 0

vlemaistre
vlemaistre

Reputation: 3331

In your second line :

date_today = date_today.strftime('%d-%m-%Y')

you cast your datetime into a string.

strftime means string from time. That is why you have the error when you try to compare your date times with a string in the line :

df['Days since contact'] = (date_today - df['Last Contacted']).dt.days

Upvotes: 2

Serge Ballesta
Serge Ballesta

Reputation: 148870

Here is the culprit:

date_today = date_today.strftime('%d-%m-%Y')

this is enough to turn date_today into a string.

What you want is:

date_today = pd.Timestamp(datetime.date.today())
df = df[['Email','First Name', 'Last Name', 'Newsletter and Caution', 'Interest','Last Contacted','Relationship']]

df['Last Contacted'] = pd.to_datetime(df['Last Contacted'], format='%Y.%m.%d %H:%M:%S')

df['Last Contacted'] = df['Last Contacted'].dt.floor('D')

df['Days since contact'] = (date_today - df['Last Contacted']).dt.days

Avoid as much as you can converting datetimes to string to remove parts. On pandas use floor for periods not greater than a day, and to_period for longer periods.

Upvotes: 2

Related Questions