Reputation: 2270
I have the following subset of my data, the actual dataset is much larger. I would like to select only the rows where the year month and day are equal between Quote_Time and Last_Trade_Date, regardless of the time being different. Wondering what the best way to do that would be.
Quote_Time Last_Trade_Date
72 2018-06-14T13:41:28.000Z 2018-06-08T19:58:04.000Z
75 2018-06-14T13:56:23.000Z 2018-06-08T19:58:04.000Z
78 2018-06-14T14:11:15.000Z 2018-06-08T19:58:04.000Z
81 2018-06-14T14:26:09.000Z 2018-06-08T19:58:04.000Z
84 2018-06-14T14:41:14.000Z 2018-06-08T19:58:04.000Z
In this small example no rows would be returned, but in the larger dataset there are matches.
Upvotes: 2
Views: 1139
Reputation: 402323
As a prereq, when reading in your data, parse the date columns:
df = pd.read_csv('file.csv', ..., parse_dates=['Quote_Time', 'Last_Trade_Date'])
Now you just need to normalize the dates and compare. Assuming both columns are datetimes, you can do:
df[df['Quote_Time'].dt.normalize() == df['Last_Trade_Date'].dt.normalize()]
Or,
df[df['Quote_Time'].dt.date == df['Last_Trade_Date'].dt.date]
Another fun solution using nunique
(not as practical):
df[df.apply(lambda x: x.dt.normalize(), axis=1).nunique(axis=1) == 1]
Upvotes: 2