Reputation: 1103
I have a dataframe that looks like this:
DayOfWeek Sunday Monday Tuesday Wednesday Thursday Friday Saturday
00 0.0 0.0 0.0 19.0 0.0 4.0 0.0
01 0.0 0.0 0.0 0.0 0.0 7.0 0.0
07 0.0 0.0 3.0 5.0 3.0 0.0 1.0
08 0.0 17.0 16.0 8.0 10.0 1.0 0.0
09 10.0 48.0 30.0 86.0 12.0 3.0 0.0
10 70.0 58.0 3.0 36.0 52.0 70.0 0.0
11 32.0 26.0 0.0 20.0 38.0 42.0 0.0
12 21.0 9.0 83.0 32.0 129.0 57.0 0.0
13 53.0 51.0 55.0 36.0 18.0 32.0 0.0
14 64.0 62.0 24.0 21.0 53.0 61.0 0.0
15 46.0 121.0 37.0 31.0 58.0 54.0 0.0
16 95.0 139.0 86.0 58.0 79.0 11.0 0.0
17 113.0 56.0 73.0 146.0 78.0 17.0 0.0
I want to convert it to percentages: sum each column, then divide each cell by the sum of its column. I tried this code:
df_day = df_day.apply(lambda x: round(100 * x / df_day.groupby('DayOfWeek').size().sum()))
But it doesn't work. Any ideas, please?
Upvotes: 1
Views: 537
Reputation: 862641
I think you need to divide with div by the column sums from sum, then multiply by 100 with mul and, if necessary, round:
print (df_day.sum())
Sunday 504.0
Monday 587.0
Tuesday 410.0
Wednesday 498.0
Thursday 530.0
Friday 359.0
Saturday 1.0
dtype: float64
print (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
Sunday Monday Tuesday Wednesday Thursday Friday Saturday
DayOfWeek
0 0.0 0.0 0.0 4.0 0.0 1.0 0.0
1 0.0 0.0 0.0 0.0 0.0 2.0 0.0
7 0.0 0.0 1.0 1.0 1.0 0.0 100.0
8 0.0 3.0 4.0 2.0 2.0 0.0 0.0
9 2.0 8.0 7.0 17.0 2.0 1.0 0.0
10 14.0 10.0 1.0 7.0 10.0 19.0 0.0
11 6.0 4.0 0.0 4.0 7.0 12.0 0.0
12 4.0 2.0 20.0 6.0 24.0 16.0 0.0
13 11.0 9.0 13.0 7.0 3.0 9.0 0.0
14 13.0 11.0 6.0 4.0 10.0 17.0 0.0
15 9.0 21.0 9.0 6.0 11.0 15.0 0.0
16 19.0 24.0 21.0 12.0 15.0 3.0 0.0
17 22.0 10.0 18.0 29.0 15.0 5.0 0.0
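For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same div / mul / round chain. The three-row frame is made up for illustration (it is not the asker's data), and the names col_sums, pct and pct_rounded are only placeholders:

```python
import pandas as pd

# Made-up stand-in for the asker's table: three hours, three days.
df_day = pd.DataFrame(
    {
        "Sunday": [10.0, 70.0, 20.0],
        "Monday": [50.0, 25.0, 25.0],
        "Saturday": [0.0, 1.0, 0.0],
    },
    index=pd.Index(["09", "10", "11"], name="DayOfWeek"),
)

# One total per column; this Series is indexed by the column names.
col_sums = df_day.sum()

# axis=1 aligns the sums with the column labels, so every cell is
# divided by the total of its own column.
pct = df_day.div(col_sums, axis=1).mul(100)

# Round only at the end, for display.
pct_rounded = pct.round(0)
print(pct_rounded)
```

Before rounding, each column of pct adds up to 100. Note that a column whose sum is 0 would produce NaN here (0 divided by 0), so such columns may need separate handling.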
Slower solution with apply:
print (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
Timings:
In [171]: %timeit (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
1000 loops, best of 3: 1.89 ms per loop
In [172]: %timeit (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
100 loops, best of 3: 5.18 ms per loop
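If apply is preferred anyway, a column-wise variant is possible as well. This is only a sketch on made-up data, not something timed above: with the default axis=0, apply hands each column to the function as a Series, so the column can be divided by its own sum directly. The names by_apply and by_div are placeholders.

```python
import pandas as pd

# Made-up sample data, not the asker's frame.
df_day = pd.DataFrame(
    {"Sunday": [10.0, 70.0, 20.0], "Monday": [50.0, 25.0, 25.0]},
    index=pd.Index(["09", "10", "11"], name="DayOfWeek"),
)

# Default axis=0: the lambda receives one column at a time.
by_apply = df_day.apply(lambda col: (100 * col / col.sum()).round(0))

# The vectorized chain from the answer, for comparison.
by_div = df_day.div(df_day.sum(), axis=1).mul(100).round(0)

print(by_apply.equals(by_div))
```

On this sample both approaches give the same rounded frame.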
Upvotes: 3