Grouping Pandas DataFrame by date

Question

I have a Pandas DataFrame that includes a date column. Elements of that column are of type pandas.tslib.Timestamp.

I'd like to group the dataframe by date, but exclude timestamp information that is more granular than the date (i.e. grouping by date, where all rows for Feb 23, 2011 are grouped together). I know how to express this in SQL, but am quite new to Pandas.

This question does something very similar, but I don't understand the code and it uses datetime objects.

From the documentation, I don't even understand how to retrieve the date from a Pandas Timestamp object. I could convert it to a datetime object, but that seems very roundabout.

As requested, the output of df.head():

                 date              show network  timed                            session_id
0 2011-12-03 02:48:52              Monk    TV38    670  00003DA9-01D2-E7A9-4177-203BE6A9E2BA
1 2011-12-03 03:00:09          WBZ News    TV38    205  00003DA9-01D2-E7A9-4177-203BE6A9E2BA
2 2011-12-03 03:04:04      Dateline NBC     NBC     30  00003DA9-01D2-E7A9-4177-203BE6A9E2BA
3 2011-12-03 03:04:35             20/20     ABC     25  00003DA9-01D2-E7A9-4177-203BE6A9E2BA
4 2011-12-03 03:04:56  College Football     FOX     55  00003DA9-01D2-E7A9-4177-203BE6A9E2BA

Andy Hayden · Accepted Answer

You can use the DatetimeIndex normalize method, which sets the time of each timestamp to midnight of the same day:

In [11]: df['date']
Out[11]: 
0   2011-12-03 02:48:52
1   2011-12-03 03:00:09
2   2011-12-03 03:04:04
3   2011-12-03 03:04:35
4   2011-12-03 03:04:56
Name: date, dtype: datetime64[ns]

In [12]: pd.DatetimeIndex(df['date']).normalize()
Out[12]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-12-03 00:00:00, ..., 2011-12-03 00:00:00]
Length: 5, Freq: None, Timezone: None

And you can group by the result:

g = df.groupby(pd.DatetimeIndex(df['date']).normalize())
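To see what that groupby produces, here is a minimal sketch. The sample frame is made up (only loosely modelled on the columns in the question, with one row moved to the next day so there are two groups), and summing timed is just one example of an aggregation:

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's columns.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2011-12-03 02:48:52",
        "2011-12-03 03:00:09",
        "2011-12-04 03:04:04",
    ]),
    "show": ["Monk", "WBZ News", "Dateline NBC"],
    "timed": [670, 205, 30],
})

# Truncate every timestamp to midnight, then use the result as the group key.
days = pd.DatetimeIndex(df["date"]).normalize()
g = df.groupby(days)

# Example aggregation: total of the 'timed' column per calendar day.
total_timed = g["timed"].sum()
print(total_timed)
```

The resulting Series is indexed by the midnight timestamps, one entry per calendar day present in the data.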

From pandas 0.15 onwards, a datetime Series has the dt accessor, so you can write this as:

g = df.groupby(df['date'].dt.normalize())
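As a rough illustration of working with the grouped object, the sketch below counts rows per day and iterates over the groups. The data is invented for the example and assumes a pandas version that has the dt accessor:

```python
import pandas as pd

# Hypothetical sample data; the 'date' column must have a datetime64 dtype.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2011-12-03 02:48:52",
        "2011-12-03 03:04:35",
        "2011-12-04 03:04:56",
    ]),
    "network": ["TV38", "ABC", "FOX"],
    "timed": [670, 25, 55],
})

# Group on the timestamps truncated to midnight.
g = df.groupby(df["date"].dt.normalize())

# Number of rows that fall on each calendar day.
rows_per_day = g.size()
print(rows_per_day)

# Each group key is a Timestamp at midnight; each group is a sub-DataFrame.
for day, group in g:
    print(day.date(), len(group))
```

The original date column is left untouched, so the full timestamps are still available inside each group.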
