Yehuda
Yehuda

Reputation: 1893

Select just month and date in datetime.date

I'm working with the NYC MVA dataset. I've combined the CRASH DATE and CRASH TIME columns into a single column with format 2017-06-26 22:00:00. I'd now like to add a categorical column based on seasons. In order to do this, I'm looking to apply a mask to each season name and fill the column based on that, using the following basic structure:

df[df['CRASH TIME'].dt.date < dt.date(:,1,2)]

The problem is that the datetime date Timestamp requires a year input; the dataset spans a number of years. I would like to select all years, not any given year. In other words, I'd like to just select the month and date, and not the year. Is there a way to do that using datetime Timestamps?

Upvotes: 1

Views: 48

Answers (2)

Leo Bogod
Leo Bogod

Reputation: 419

assuming your using pandas for manipulating the data you could do something like this

df['day'] = df['CRASH TIME'].apply(lambda r:r.day)
df['month'] = df['CRASH TIME'].apply(lambda r:r.month)

Then you can combine them or work with them just as they are.

Upvotes: 2

yedpodtrzitko
yedpodtrzitko

Reputation: 9359

I'm not sure there's a way how to directly compare only a part of date, but you can extract the month and day into a tuple and compare them that way:

month_day_left = (df['CRASH TIME'].dt.date.month, df['CRASH TIME'].dt.date.day)
month_day_right = (dt.date.month, dt.date.day)

(2, 1) < (2, 2) # True
(1, 10) < (2, 1) # True
(2, 1) < (1, 30) # False

you can eventually wrap this comparison into a custom function and use it that way:

df[ is_earlier(df['CRASH TIME'].dt, dt)]

Upvotes: 1

Related Questions