DramboHero

Reputation: 1103

Python pandas - add lambda to each column

I have a dataframe that looks like this:

DayOfWeek  Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
00            0.0     0.0      0.0       19.0       0.0     4.0       0.0
01            0.0     0.0      0.0        0.0       0.0     7.0       0.0
07            0.0     0.0      3.0        5.0       3.0     0.0       1.0
08            0.0    17.0     16.0        8.0      10.0     1.0       0.0
09           10.0    48.0     30.0       86.0      12.0     3.0       0.0
10           70.0    58.0      3.0       36.0      52.0    70.0       0.0
11           32.0    26.0      0.0       20.0      38.0    42.0       0.0
12           21.0     9.0     83.0       32.0     129.0    57.0       0.0
13           53.0    51.0     55.0       36.0      18.0    32.0       0.0
14           64.0    62.0     24.0       21.0      53.0    61.0       0.0
15           46.0   121.0     37.0       31.0      58.0    54.0       0.0
16           95.0   139.0     86.0       58.0      79.0    11.0       0.0
17          113.0    56.0     73.0      146.0      78.0    17.0       0.0

and I want to convert it to percentages: I want to sum each column and divide each cell by the sum of its column. So I wrote this code:

df_day = df_day.apply(lambda x: round(100 * x / df_day.groupby('DayOfWeek').size().sum()))

But it doesn't work.

Any ideas, please?

Upvotes: 1

Views: 537

Answers (1)

jezrael

Reputation: 862641

I think you need to divide the frame by its column sums with div, then multiply by 100 with mul and, if necessary, round:

print (df_day.sum())
Sunday       504.0
Monday       587.0
Tuesday      410.0
Wednesday    498.0
Thursday     530.0
Friday       359.0
Saturday       1.0
dtype: float64

print (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
           Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
DayOfWeek                                                                
0             0.0     0.0      0.0        4.0       0.0     1.0       0.0
1             0.0     0.0      0.0        0.0       0.0     2.0       0.0
7             0.0     0.0      1.0        1.0       1.0     0.0     100.0
8             0.0     3.0      4.0        2.0       2.0     0.0       0.0
9             2.0     8.0      7.0       17.0       2.0     1.0       0.0
10           14.0    10.0      1.0        7.0      10.0    19.0       0.0
11            6.0     4.0      0.0        4.0       7.0    12.0       0.0
12            4.0     2.0     20.0        6.0      24.0    16.0       0.0
13           11.0     9.0     13.0        7.0       3.0     9.0       0.0
14           13.0    11.0      6.0        4.0      10.0    17.0       0.0
15            9.0    21.0      9.0        6.0      11.0    15.0       0.0
16           19.0    24.0     21.0       12.0      15.0     3.0       0.0
17           22.0    10.0     18.0       29.0      15.0     5.0       0.0
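
As a minimal, self-contained sketch of the same idea, the snippet below rebuilds a small frame from rows 09 to 11 of the question's data (Sunday and Monday only) and applies the same div/mul/round chain. The name col_totals is only illustrative, and the way the frame is constructed is an assumption, since the question does not show how df_day was created:

```python
import pandas as pd

# Sample taken from rows 09-11 of the question's table (two columns only).
# Building the frame this way is an assumption for illustration.
df_day = pd.DataFrame(
    {"Sunday": [10.0, 70.0, 32.0], "Monday": [48.0, 58.0, 26.0]},
    index=pd.Index(["09", "10", "11"], name="DayOfWeek"),
)

# One total per column; the result is a Series indexed by column name.
col_totals = df_day.sum()

# axis=1 aligns col_totals with the column labels, so every cell is
# divided by the total of its own column.
unrounded = df_day.div(col_totals, axis=1).mul(100)
pct = unrounded.round(0)
print(pct)
```

Before rounding, each column of the result sums to 100, which is a quick way to check that the division used the right totals.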

A slower solution with apply, which processes the frame row by row:

print (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
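
Since the question asks for a lambda per column, a column-wise apply (the default axis=0) may be closer to what was intended: each col passed to the lambda is one column, so col.sum() is that column's total. This is only a sketch on a small made-up frame (the values are not from the question), and it is not timed here:

```python
import pandas as pd

# Made-up sample values, used only to illustrate the technique.
df_day = pd.DataFrame(
    {"Sunday": [10.0, 30.0, 60.0], "Monday": [10.0, 10.0, 20.0]},
    index=pd.Index(["09", "10", "11"], name="DayOfWeek"),
)

# Default axis=0: the lambda receives one column at a time.
by_apply = df_day.apply(lambda col: (100 * col / col.sum()).round(0))

# The vectorized form from the answer, for comparison.
by_div = df_day.div(df_day.sum(), axis=1).mul(100).round(0)

print(by_apply)
print(by_apply.equals(by_div))
```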

Timings:

In [171]: %timeit (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
1000 loops, best of 3: 1.89 ms per loop

In [172]: %timeit (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
100 loops, best of 3: 5.18 ms per loop

Upvotes: 3
