Reputation: 1693
I have a dataframe df and it has a Date column. I want to create two new data frames: one that contains all of the rows from df where the year equals some_year, and another that contains all of the rows of df where the year does not equal some_year. I know you can do df.ix['2000-1-1' : '2001-1-1'], but getting all of the rows that are not in 2000 requires creating two extra data frames and then concatenating/joining them.
Is there some way like this?
include = df[df.Date.year == year]
exclude = df[df['Date'].year != year]
This code doesn't work, but is there any similar sort of way?
Upvotes: 36
Views: 94492
Reputation: 38415
You can use the .dt datetime accessor.
import pandas as pd
df['Date'] = pd.to_datetime(df['Date'])
include = df[df['Date'].dt.year == year]
exclude = df[df['Date'].dt.year != year]
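For anyone who wants to try this end to end, here is a minimal sketch. The sample rows, the value column, and the year variable are made up for illustration; the question does not show the real data.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df.
df = pd.DataFrame({
    "Date": ["1999-12-31", "2000-01-01", "2000-06-15", "2001-01-01"],
    "value": [1, 2, 3, 4],
})
year = 2000  # assumed to be an int here

# Parse the strings into datetime64 values so the .dt accessor is available.
df["Date"] = pd.to_datetime(df["Date"])

# Boolean indexing on the year component of each date.
include = df[df["Date"].dt.year == year]
exclude = df[df["Date"].dt.year != year]

print(include)
print(exclude)
```

With this sample, include should hold the two rows dated in 2000 and exclude the other two.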
Upvotes: 72
Reputation: 862511
You can simplify it by inverting the mask with ~. For the condition, use Series.dt.year, with int to cast the year if it is a string:
mask = df['Date'].dt.year == int(year)
include = df[mask]
exclude = df[~mask]
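If you need this in more than one place, the same idea could be wrapped in a small helper. This is only a sketch: split_by_year and its column parameter are hypothetical names, and it assumes the column already has a datetime dtype.

```python
import pandas as pd

def split_by_year(df, year, column="Date"):
    """Return (include, exclude) for rows whose year matches / does not match.

    Hypothetical helper; assumes df[column] is already datetime64.
    """
    # int() lets the caller pass the year as either 2000 or "2000".
    mask = df[column].dt.year == int(year)
    # ~mask flips every boolean, so the two results partition df.
    return df[mask], df[~mask]

# Made-up sample data for illustration.
df = pd.DataFrame({
    "Date": pd.to_datetime(["1999-12-31", "2000-01-01", "2000-06-15", "2001-01-01"]),
    "value": [1, 2, 3, 4],
})

include, exclude = split_by_year(df, "2000")
print(len(include), len(exclude))
```

Because the mask is computed once and then inverted, every row of df ends up in exactly one of the two results. Rows with a missing date (NaT) compare as not equal, so they land in exclude.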
Upvotes: 15