Reputation: 419
I used the Following code to generate random date and values:
import pandas as pd
import numpy as np
time = pd.date_range('1/1/2000', periods=2000, freq='5min')
series = pd.Series(np.random.randint(100, size=2000), index=time)
the output looks like this:
2000-01-01 00:00:00 40
2000-01-01 00:05:00 13
2000-01-01 00:10:00 99
2000-01-01 00:15:00 72
2000-01-01 00:20:00 4
2000-01-01 00:25:00 36
2000-01-01 00:30:00 24
2000-01-01 00:35:00 20
2000-01-01 00:40:00 83
2000-01-01 00:45:00 44
Then I sort this data by the index hour value then aggregate it by the average which looks like this:
0 50.380952
1 49.380952
2 49.904762
3 53.273810
4 47.178571
5 46.095238
6 49.047619
7 44.297619
8 53.119048
9 48.261905
10 45.166667
11 54.214286
12 50.714286
13 56.130952
14 50.916667
15 42.428571
16 46.880952
17 56.892857
18 54.071429
19 47.607143
20 50.940476
21 50.511905
22 44.550000
23 50.250000
But what should I do now if I want to group all the data just by the index hour value and without the average so that I can get all the values associated with their index?
Thanks in advance.
Regards,
Upvotes: 3
Views: 4761
Reputation: 863541
If want aggregate by hour
s with means:
np.random.seed(456)
time = pd.date_range('1/1/2000', periods=2000, freq='5min')
series = pd.Series(np.random.randint(100, size=2000), index=time)
s = series.groupby(series.index.hour).mean()
print (s)
0 49.392857
1 52.523810
2 53.047619
3 49.083333
4 49.785714
5 49.071429
6 52.476190
7 47.821429
8 52.190476
9 50.000000
10 49.035714
11 52.988095
12 52.785714
13 52.023810
14 46.964286
15 52.095238
16 51.047619
17 52.166667
18 48.357143
19 51.416667
20 45.214286
21 46.130952
22 49.750000
23 48.527778
dtype: float64
But if need MultiIndex
by hours:
series.index = [series.index.hour, series.index]
print (series)
0 2000-01-01 00:00:00 27
2000-01-01 00:05:00 43
2000-01-01 00:10:00 89
2000-01-01 00:15:00 42
2000-01-01 00:20:00 28
2000-01-01 00:25:00 79
2000-01-01 00:30:00 60
2000-01-01 00:35:00 45
2000-01-01 00:40:00 37
2000-01-01 00:45:00 92
2000-01-01 00:50:00 39
2000-01-01 00:55:00 81
1 2000-01-01 01:00:00 11
2000-01-01 01:05:00 77
2000-01-01 01:10:00 69
2000-01-01 01:15:00 98
...
Then is possible select by hours:
print (series.loc[0])
2000-01-01 00:00:00 27
2000-01-01 00:05:00 43
2000-01-01 00:10:00 89
2000-01-01 00:15:00 42
2000-01-01 00:20:00 28
2000-01-01 00:25:00 79
2000-01-01 00:30:00 60
2000-01-01 00:35:00 45
2000-01-01 00:40:00 37
2000-01-01 00:45:00 92
2000-01-01 00:50:00 39
2000-01-01 00:55:00 81
2000-01-02 00:00:00 82
2000-01-02 00:05:00 69
2000-01-02 00:10:00 99
2000-01-02 00:15:00 17
2000-01-02 00:20:00 59
...
Also if need mean
s with no change DatetimeIndex
:
s1 = series.groupby(series.index.hour).transform('mean')
print (s1)
2000-01-01 00:00:00 49.392857
2000-01-01 00:05:00 49.392857
2000-01-01 00:10:00 49.392857
2000-01-01 00:15:00 49.392857
2000-01-01 00:20:00 49.392857
2000-01-01 00:25:00 49.392857
2000-01-01 00:30:00 49.392857
2000-01-01 00:35:00 49.392857
2000-01-01 00:40:00 49.392857
2000-01-01 00:45:00 49.392857
2000-01-01 00:50:00 49.392857
2000-01-01 00:55:00 49.392857
2000-01-01 01:00:00 52.523810
2000-01-01 01:05:00 52.523810
2000-01-01 01:10:00 52.523810
2000-01-01 01:15:00 52.523810
2000-01-01 01:20:00 52.523810
2000-01-01 01:25:00 52.523810
2000-01-01 01:30:00 52.523810
...
EDIT:
For list per hours use:
s = series.groupby(series.index.hour).apply(list)
print (s)
0 [27, 43, 89, 42, 28, 79, 60, 45, 37, 92, 39, 8...
1 [11, 77, 69, 98, 78, 84, 34, 66, 4, 8, 85, 62,...
2 [16, 41, 10, 72, 44, 35, 48, 51, 99, 53, 22, 3...
3 [56, 22, 74, 85, 81, 6, 44, 44, 49, 43, 95, 11...
4 [21, 90, 89, 76, 62, 20, 66, 50, 68, 79, 69, 4...
5 [51, 85, 31, 58, 97, 10, 91, 25, 4, 11, 94, 28...
6 [5, 71, 62, 57, 62, 87, 12, 41, 43, 47, 25, 15...
7 [84, 17, 26, 32, 14, 76, 72, 35, 8, 60, 79, 27...
8 [15, 30, 80, 53, 10, 97, 71, 83, 37, 44, 89, 1...
9 [58, 20, 98, 77, 75, 26, 63, 26, 24, 62, 93, 6...
10 [39, 61, 92, 43, 61, 73, 86, 64, 26, 0, 75, 11...
11 [24, 13, 13, 54, 50, 38, 22, 46, 67, 15, 29, 4...
12 [21, 56, 16, 63, 46, 79, 11, 85, 87, 18, 66, 9...
13 [10, 89, 66, 80, 60, 2, 6, 19, 77, 81, 38, 48,...
14 [17, 64, 90, 91, 71, 32, 77, 9, 76, 14, 9, 79,...
15 [95, 75, 49, 34, 5, 31, 43, 68, 84, 48, 25, 69...
16 [13, 68, 87, 96, 6, 83, 9, 5, 29, 93, 57, 92, ...
17 [77, 6, 73, 41, 76, 93, 11, 50, 72, 84, 82, 53...
18 [95, 11, 61, 56, 30, 24, 24, 9, 0, 65, 96, 82,...
19 [31, 14, 98, 67, 7, 54, 29, 60, 77, 83, 45, 70...
20 [4, 15, 37, 78, 79, 59, 63, 97, 14, 74, 33, 2,...
21 [88, 69, 31, 20, 41, 10, 41, 6, 36, 27, 63, 49...
22 [4, 90, 70, 66, 92, 46, 54, 47, 6, 54, 62, 80,...
23 [27, 23, 21, 18, 29, 39, 77, 88, 21, 86, 7, 45...
dtype: object
Upvotes: 3