Reputation: 8790
Following this discussion, you can group data by the time of day rather than by the datetime itself:
import pandas as pd
import numpy as np
dr = pd.date_range('01-01-2020', '01-03-2020', freq='30min')  # '30T' is a deprecated alias in newer pandas
df = pd.DataFrame(np.random.rand(len(dr)), index=dr)
df = df.groupby([df.index.hour, df.index.minute]).sum()
This aggregates the data as expected, but returns a DataFrame with a MultiIndex, with unlabeled hours at one level and unlabeled minutes at the other:
             0
0 0   1.383863
  30  1.293589
1 0   0.545617
  30  0.208717
2 0   0.464892
  30  0.392486
...
I want to create a new single index which represents the times either as strings or as actual datetime.time() objects. I can get the times as strings by doing something like the following:
hours = df.index.get_level_values(0).astype(str).str.pad(2, fillchar='0')
minutes = df.index.get_level_values(1).astype(str).str.pad(2, fillchar='0')
new = hours + ':' + minutes
df = df.set_index(new, drop=True)
df.index.name = 'time'
Output:
              0
time
00:00  1.203189
00:30  1.943932
01:00  1.671113
01:30  1.640122
02:00  1.240969
02:30  0.687489
...
This works, but it is a very cumbersome set of commands. My questions are:
Is there a simpler way to build the string index?
How can I get the index as datetime.time() objects instead? I know I could convert the original index with df.index = df.index.time, but I haven't found a way to take the hours from one level of the index and the minutes from another level to get time objects.
Upvotes: 0
Views: 248
Reputation: 16147
To group into strings:
import pandas as pd
import numpy as np
dr = pd.date_range('01-01-2020', '01-03-2020', freq='30min')  # '30T' is a deprecated alias in newer pandas
df = pd.DataFrame(np.random.rand(len(dr)), index=dr)
df = df.groupby(df.index.strftime('%H:%M')).sum()
df.index.name = 'time'
Or, to group into datetime.time objects, replace the groupby line with:
df = df.groupby(df.index.time).sum()  # df.index.time is an array of datetime.time objects
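If you already have the hour/minute MultiIndex from the question and want to convert it instead of regrouping, something like the sketch below should work. It relies on each MultiIndex entry being an (hour, minute) tuple. The names by_hm, by_time and times are only illustrative, and '30min' is used in place of '30T', which newer pandas releases deprecate.

```python
import datetime

import numpy as np
import pandas as pd

dr = pd.date_range('01-01-2020', '01-03-2020', freq='30min')
df = pd.DataFrame(np.random.rand(len(dr)), index=dr)

# Group by hour and minute, which gives the two-level MultiIndex from the question.
by_hm = df.groupby([df.index.hour, df.index.minute]).sum()

# Iterating a MultiIndex yields (hour, minute) tuples, so each one can be
# unpacked straight into a datetime.time object.
times = [datetime.time(int(h), int(m)) for h, m in by_hm.index]
by_hm.index = pd.Index(times, name='time')

# For comparison: grouping by time of day directly, as in the answer above.
by_time = df.groupby(df.index.time).sum()
```

Both frames should end up with the same index and the same sums, so the direct groupby is usually the simpler choice unless the MultiIndex result is all you have.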
Upvotes: 1