Reputation: 747
I have a pandas DataFrame with a column containing datetime values, which I converted to datetime objects with pd.to_datetime(). I want to compare these values to a single date such as TODAY, which I get from datetime.date.today(). Here is my code:
data['date'] = pd.to_datetime(data['date'])
data['choose'] = data['date'] > datetime.date.today()
However, this does not work, and I get this error:
TypeError: Invalid comparison between dtype=datetime64[ns] and date
I found a workaround, which is to create a column that contains TODAY in every row:
data['today'] = datetime.date.today()
data['today'] = pd.to_datetime(data['today'])
data['choose'] = data['date'] > data['today']
But this is inefficient, because the extra column takes up memory. What would be the most efficient way to achieve this?
Upvotes: 0
Views: 1115
Reputation: 62383
.dt.date will convert the dataframe series to datetime.date, which can be compared to date.today().
import pandas as pd
from datetime import date, datetime
# setup dataframe
data = {'date': pd.bdate_range(datetime.today(), periods=15).tolist()}
df = pd.DataFrame(data)
# Boolean
df['choose'] = df['date'].dt.date > date.today()
print(df)
date choose
2020-05-04 False
2020-05-05 True
2020-05-06 True
2020-05-07 True
2020-05-08 True
2020-05-11 True
2020-05-12 True
2020-05-13 True
2020-05-14 True
2020-05-15 True
2020-05-18 True
2020-05-19 True
2020-05-20 True
2020-05-21 True
2020-05-22 True
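Since the question asks about efficiency, here is a sketch of an alternative that is not part of the original answer: compare the datetime64[ns] column directly against a pd.Timestamp, which skips the conversion to Python date objects. The sample dates below are made up for illustration, and the sketch assumes the column is timezone-naive.

```python
import pandas as pd

# Hypothetical sample data: one date far in the past, one far in the future
df = pd.DataFrame({'date': pd.to_datetime(['2000-01-01', '2100-01-01'])})

# Today at midnight as a pandas Timestamp, so both sides are datetime-like
today = pd.Timestamp.today().normalize()

# Vectorized comparison on the datetime64[ns] column; no helper column needed
df['choose'] = df['date'] > today
print(df)
```

Note that this compares against midnight, so a row stamped later today would count as greater than today, unlike the .dt.date version. If the column carries time components, comparing df['date'].dt.normalize() against today should give the date-only behaviour.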
Upvotes: 1