nicholas.reichel
nicholas.reichel

Reputation: 2270

Compare only Date between two columns of datatime objects in Pandas DataFrame

I have the following subset of my data, the actual dataset is much larger. I would like to select only the rows where the year month and day are equal between Quote_Time and Last_Trade_Date, regardless of the time being different. Wondering what the best way to do that would be.

                  Quote_Time           Last_Trade_Date
72  2018-06-14T13:41:28.000Z  2018-06-08T19:58:04.000Z
75  2018-06-14T13:56:23.000Z  2018-06-08T19:58:04.000Z
78  2018-06-14T14:11:15.000Z  2018-06-08T19:58:04.000Z
81  2018-06-14T14:26:09.000Z  2018-06-08T19:58:04.000Z
84  2018-06-14T14:41:14.000Z  2018-06-08T19:58:04.000Z

In this small example no rows would be returned, but in the larger dataset there are matches.

Upvotes: 2

Views: 1139

Answers (1)

cs95
cs95

Reputation: 402323

As a prereq, when reading in your data, parse the date columns:

df = pd.read_csv('file.csv', ..., parse_dates=['Quote_Time', 'Last_Trade_Date'])

Now you just need to normalize the dates and compare. Assuming both columns are datetimes, you can do:

df[df['Quote_Time'].dt.normalize() == df['Last_Trade_Date'].dt.normalize()]  

Or,

df[df['Quote_Time'].dt.date == df['Last_Trade_Date'].dt.date]

Another fun solution using nunique (not as practical):

df[df.apply(lambda x: x.dt.normalize(), axis=1).nunique(axis=1) == 1]

Upvotes: 2

Related Questions