user3494047

Reputation: 1693

Pandas filter dataframe rows with a specific year

I have a dataframe df with a Date column. I want to create two new data frames: one containing all of the rows from df where the year equals some_year, and another containing all of the rows of df where the year does not equal some_year. I know you can do df.ix['2000-1-1' : '2001-1-1'], but getting all of the rows that are not in 2000 that way requires creating two extra data frames and then concatenating/joining them.

Is there a way to do something like this?

include = df[df.Date.year == year]
exclude = df[df['Date'].year != year]

This code doesn't work, but is there any similar sort of way?

Upvotes: 36

Views: 94492

Answers (2)

Vaishali

Reputation: 38415

You can use the datetime accessor (.dt) once the column has a datetime dtype.

import pandas as pd

# Convert the column to datetime64 so that the .dt accessor is available
df['Date'] = pd.to_datetime(df['Date'])

include = df[df['Date'].dt.year == year]
exclude = df[df['Date'].dt.year != year]
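
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The sample rows, the value column, and year = 2000 are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data; the Date column starts out as plain strings.
df = pd.DataFrame({
    "Date": ["1999-12-31", "2000-01-01", "2000-06-15", "2001-01-01"],
    "value": [1, 2, 3, 4],
})

# Parse the strings into datetime64 values so .dt can be used.
df["Date"] = pd.to_datetime(df["Date"])

year = 2000  # assumed target year

# Rows whose year matches, and rows whose year does not.
include = df[df["Date"].dt.year == year]
exclude = df[df["Date"].dt.year != year]
```

With this data, include should hold the two rows from 2000 and exclude the rows from 1999 and 2001.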

Upvotes: 72

jezrael

Reputation: 862511

You can simplify it by building the boolean mask once and inverting it with ~. Use Series.dt.year for the condition, and cast year with int() in case it is a string:

mask = df['Date'].dt.year == int(year)
include = df[mask]
exclude = df[~mask]
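
One thing that may be worth checking with the inverted mask is missing dates. The sketch below uses made-up data with one missing date (NaT) and a string year. The exclude_known name is hypothetical and only shows one possible way to keep rows with unknown dates out of exclude.

```python
import pandas as pd

# Hypothetical sample data with one missing date (None becomes NaT).
df = pd.DataFrame({
    "Date": pd.to_datetime(["1999-12-31", "2000-01-01", None, "2001-01-01"]),
    "value": [1, 2, 3, 4],
})

year = "2000"  # assumed to arrive as a string, hence the int() cast

# Build the mask once; NaT has no year, so its comparison is False.
mask = df["Date"].dt.year == int(year)

include = df[mask]
exclude = df[~mask]  # the NaT row lands here, since ~False is True

# Optional: drop rows with unknown dates from the "not this year" frame.
exclude_known = df[~mask & df["Date"].notna()]
```

Whether rows without a date belong in exclude depends on the use case, so the extra notna() filter is a choice rather than a requirement.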

Upvotes: 15
