Reputation: 7403
I have a DataFrame like this:
                       value  estimated
dttm_timezone
2011-12-31 20:10:00  10.7891          0
2011-12-31 20:15:00  11.2060          0
2011-12-31 20:20:00  19.9975          0
2011-12-31 20:25:00  15.9975          0
2011-12-31 20:30:00  10.9975          0
2011-12-31 20:35:00  13.9975          0
2011-12-31 20:40:00  15.9975          0
2011-12-31 20:45:00  11.7891          0
2011-12-31 20:50:00  10.9975          0
2011-12-31 20:55:00  10.3933          0
Using the dttm_timezone timestamps, I would like to extract all the rows that fall within a single day, week, or month.
I have one year of data, so if I select day as the duration I should get 365 separate daily subsets, and if I select month I should get 12 separate monthly subsets.
How can I achieve this?
Upvotes: 1
Views: 324
Reputation: 294218
Let's use this sample data:
import pandas as pd
import numpy as np
tidx = pd.date_range('2010-01-01', '2014-12-31', freq='H', name='dtime')
np.random.seed([3,1415])
df = pd.DataFrame(np.random.rand(len(tidx)), tidx, ['value'])
You can limit the data to the year 2010 like this:
df.loc['2010']  # use .loc; in recent pandas a plain df['2010'] is treated as a column lookup
or
df[df.index.year == 2010]
You can limit to a specific month:
df.loc['2010-04']
or all Aprils:
df[df.index.month == 4]
You can limit to a specific day:
df.loc['2010-04-28']
or all rows at 1:00 pm:
df[df.index.hour == 13]
You can select a range of dates (both endpoints are included):
df.loc['2011':'2013']
or
df.loc['2011-01-01':'2013-06-30']
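The question also asks about weeks, which have no string shortcut like a year or a month. A minimal sketch of two ways to pick out one week, assuming a sorted DatetimeIndex and pandas 1.1 or later for isocalendar(); the dates and the week number are only illustrative:

```python
import numpy as np
import pandas as pd

# Hourly sample data for one year ('60min' is used here instead of 'H').
tidx = pd.date_range('2010-01-01', '2010-12-31 23:00', freq='60min', name='dtime')
df = pd.DataFrame(np.random.rand(len(tidx)), tidx, ['value'])

# One week as an explicit date range, Monday through Sunday.
# Label slices include both endpoints, so all of 2010-04-11 is kept.
week_by_range = df.loc['2010-04-05':'2010-04-11']

# The same week selected by ISO year and ISO week number.
iso = df.index.isocalendar()
mask = ((iso['year'] == 2010) & (iso['week'] == 14)).to_numpy(dtype=bool)
week_by_iso = df[mask]
```

Filtering on the ISO year as well as the week number matters when the data spans several years, since week 14 occurs once per year.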
There are many ways to combine these conditions. For example, all rows in November at 10:00 pm:
df.loc[(df.index.month == 11) & (df.index.hour == 22)]
The list can go on and on. Please read the docs.
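To get every day, week, or month as its own subset, as the question asks, one option is to group the rows by period. This is only a sketch: split_by is a hypothetical helper, not a pandas function, the data is random, and the index is assumed to be timezone-naive (to_period drops timezone information).

```python
import numpy as np
import pandas as pd

# One year of 5-minute readings, similar in shape to the question's data.
tidx = pd.date_range('2011-01-01', '2011-12-31 23:55', freq='5min',
                     name='dttm_timezone')
df = pd.DataFrame({'value': np.random.rand(len(tidx))}, index=tidx)

def split_by(frame, unit):
    """Hypothetical helper: map each day, week or month to its rows."""
    freq = {'day': 'D', 'week': 'W', 'month': 'M'}[unit]
    # Convert every timestamp to the period (day/week/month) it falls in.
    periods = frame.index.to_period(freq)
    # groupby yields (period, sub-frame) pairs, one per distinct period.
    return {period: chunk for period, chunk in frame.groupby(periods)}

days = split_by(df, 'day')      # one DataFrame per calendar day
weeks = split_by(df, 'week')    # one DataFrame per week ending on Sunday
months = split_by(df, 'month')  # one DataFrame per calendar month
```

Each value is an ordinary DataFrame, so months[pd.Period('2011-02', freq='M')] returns the February rows. Weeks at the start and end of the year can be partial because they cross the year boundary.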
Upvotes: 3