Reputation: 1105
I have a dataframe with minute-by-minute datetime.datetime values in its index column. I want to run a loop in which each iteration takes just the data for one given day. Is there a better way to do this than the following?
data['index_date'] = data['index'].apply(lambda dt: datetime.datetime(dt.year, dt.month, dt.day, 0,0))
days= data['index_date'].unique()
for day in days:
    data_day = data[data['index_date'] == day]
Here is a sample of what the "data" dataframe looks like:
>>> data
Out[8]:
index 90 180
2016-01-04 02:30:00-05:00 1.000 1.000
2016-01-04 02:31:00-05:00 1.000 1.000
2016-01-04 02:32:00-05:00 1.000 1.000
2016-01-04 02:33:00-05:00 1.000 1.000
2016-01-04 02:34:00-05:00 1.000 1.000
... ... ...
2016-07-26 12:51:00-04:00 1.000 1.000
2016-07-26 12:52:00-04:00 1.000 1.000
2016-07-26 12:53:00-04:00 1.000 1.000
2016-07-26 12:54:00-04:00 1.000 1.000
2016-07-26 12:55:00-04:00 1.000 1.000
2016-07-26 12:56:00-04:00 1.000 1.000
Upvotes: 2
Views: 702
Reputation: 294526
Consider the dataframe df:
import numpy as np
import pandas as pd
n = 10000
df = pd.DataFrame({'index': pd.date_range('2010-01-01', periods=n, freq='min'),
                   90: np.random.rand(n) * 10,
                   100: np.random.randn(n) * 100})
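Then you can get a dictionary of days, keyed by the start of each day:
# pd.Grouper(freq='D') replaces the older, since-removed pd.TimeGrouper('D')
g = df.set_index('index').groupby(pd.Grouper(freq='D'))
d = {k: v for k, v in g}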
Or a panel (older pandas only; pd.Panel has been removed from recent releases):
p = pd.Panel(d)
Or a dataframe with the day as the outer index level:
# Passing the dict directly uses its keys as the outer level
dfg = pd.concat(d)
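For the loop in the question, the groupby object can also be iterated directly, without building the dictionary first. The following is only a minimal sketch under some assumptions: the timestamps live in a column named 'index' as in the question, the data is synthetic, and rows_per_day is a hypothetical name standing in for whatever per-day work the loop does.

```python
import numpy as np
import pandas as pd

# Synthetic minute-by-minute data (an assumption; stands in for `data` in the question)
n = 3000
data = pd.DataFrame({'index': pd.date_range('2016-01-04 02:30', periods=n, freq='min'),
                     90: np.ones(n),
                     180: np.ones(n)})

# normalize() sets each timestamp to midnight, so rows from the same day share a key.
# Grouping on these values yields only the days that actually occur in the data.
rows_per_day = {}
for day, data_day in data.groupby(data['index'].dt.normalize()):
    # data_day holds all rows for one calendar day; here we just count them
    rows_per_day[day.date()] = len(data_day)
```

This avoids both the extra 'index_date' column and the repeated boolean filtering over the full dataframe on every iteration.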
Upvotes: 2