Reputation: 623
I have the following dataframe:
dates,rr,ax,be
2018-01-01 00:00:00,45.73,47.63,45.83
2018-01-01 01:00:00,44.16,44.42,44.59
2018-01-01 02:00:00,42.24,42.34,40.22
2018-01-01 03:00:00,39.29,38.36,37.31
2018-01-01 04:00:00,36.0,36.87,32.88
2018-01-01 05:00:00,41.99,39.79,39.27
2018-01-01 06:00:00,42.25,42.08,43.62
2018-01-01 07:00:00,44.97,51.19,49.69
2018-01-01 08:00:00,45.0,59.69,49.98
2018-01-01 09:00:00,44.94,56.67,48.04
2018-01-01 10:00:00,45.04,53.54,46.85
2018-01-01 11:00:00,46.67,52.6,47.95
2018-01-01 12:00:00,46.99,50.77,46.6
2018-01-01 13:00:00,44.16,50.27,43.02
2018-01-01 14:00:00,45.26,50.64,44.2
2018-01-01 15:00:00,47.84,54.79,47.1
2018-01-01 16:00:00,50.1,60.17,50.83
2018-01-01 17:00:00,54.3,59.47,58.31
2018-01-01 18:00:00,51.91,60.16,63.5
2018-01-01 19:00:00,51.38,70.81,61.9
2018-01-01 20:00:00,49.2,62.65,59.62
2018-01-01 21:00:00,45.73,59.71,52.84
2018-01-01 22:00:00,44.84,50.96,51.43
2018-01-01 23:00:00,38.11,46.52,45.35
2018-01-02 00:00:00,19.19,49.62,41.61
2018-01-02 01:00:00,14.99,45.05,40.78
2018-01-02 02:00:00,11.0,45.18,39.59
then I group according to the hour in the index as follows:
df1 = df.assign(hour=lambda x: x.index.hour).groupby('hour')
at this point, I would like to compute the mean of each group. I do as follow:
aa = df1['hour'].mean()
This seems to not working. What am I missing? Thanks, Diego
Upvotes: 0
Views: 42
Reputation: 863166
Solution looks good, should by simplify:
aa = df.groupby(df.index.hour.rename('hour')).mean()
print (aa)
rr ax be
hour
0 32.460 48.625 43.720
1 29.575 44.735 42.685
2 26.620 43.760 39.905
3 39.290 38.360 37.310
4 36.000 36.870 32.880
5 41.990 39.790 39.270
6 42.250 42.080 43.620
7 44.970 51.190 49.690
8 45.000 59.690 49.980
9 44.940 56.670 48.040
10 45.040 53.540 46.850
11 46.670 52.600 47.950
12 46.990 50.770 46.600
13 44.160 50.270 43.020
14 45.260 50.640 44.200
15 47.840 54.790 47.100
16 50.100 60.170 50.830
17 54.300 59.470 58.310
18 51.910 60.160 63.500
19 51.380 70.810 61.900
20 49.200 62.650 59.620
21 45.730 59.710 52.840
22 44.840 50.960 51.430
23 38.110 46.520 45.350
In your solution:
df1 = df.assign(hour=lambda x: x.index.hour).groupby('hour')
#removed ['hour'] for get means of all numeric columns
aa = df1.mean()
Upvotes: 3
Reputation: 10624
Your dates column is not your index and it's also a string column. You can try the following:
df['hour']=df.dates.apply(lambda x: x[11:13])
df1=df.groupby('hour').mean()
del df['hour']
Output:
>>> print(df1)
rr ax be
hour
00 32.460 48.625 43.720
01 29.575 44.735 42.685
02 26.620 43.760 39.905
03 39.290 38.360 37.310
04 36.000 36.870 32.880
05 41.990 39.790 39.270
06 42.250 42.080 43.620
07 44.970 51.190 49.690
08 45.000 59.690 49.980
09 44.940 56.670 48.040
10 45.040 53.540 46.850
11 46.670 52.600 47.950
12 46.990 50.770 46.600
13 44.160 50.270 43.020
14 45.260 50.640 44.200
15 47.840 54.790 47.100
16 50.100 60.170 50.830
17 54.300 59.470 58.310
18 51.910 60.160 63.500
19 51.380 70.810 61.900
20 49.200 62.650 59.620
21 45.730 59.710 52.840
22 44.840 50.960 51.430
23 38.110 46.520 45.350
Upvotes: 0