spidermarn
spidermarn

Reputation: 939

R how to summarize the % breakdown of a column by other columns

I have a dataframe like this:

VisitID | No_Of_Visits | Store A | Store B | Store C | Store D|
   A1   |  1           |   1     |  0      |  0      |  0     |
   B1   |  2           |   1     |  0      |  0      |  1     |
   C1   |  4           |   1     |  2      |  1      |  0     |
   D1   |  3           |   2     |  0      |  1      |  0     |
   E1   |  4           |   1     |  1      |  1      |  1     |

In R how can I summarize the Dataframe to show the % of visits of each Store Category by Visit count lvl? Expected result:

| No_Of_Visits | Store A | Store B | Store C | Store D|
|  1           |   100%  |  0      |  0      |  0     |
|  2           |   50%   |  0      |  0      |  50%   |
|  3           |   67%   |  0%     |  33%    |  0     |
|  4           |   25%   |  38%    |  25%    |  13%   |

I'm thinking of group_by(No_Of_Visits) and mutate_all?

Upvotes: 0

Views: 180

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 389047

We can get the data in long format and calculate the sum for each No_Of_Visits and Store and then calculate their ratio before getting the data to wide format.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = starts_with('Store')) %>%
  group_by(No_Of_Visits, name) %>%
  summarise(value = sum(value)) %>%
  mutate(value = round(value/sum(value) * 100, 2)) %>%
  pivot_wider()

#  No_Of_Visits Store.A Store.B Store.C Store.D
#         <int>   <dbl>   <dbl>   <dbl>   <dbl>
#1            1   100       0       0       0  
#2            2    50       0       0      50  
#3            3    66.7     0      33.3     0  
#4            4    25      37.5    25      12.5

Upvotes: 1

Related Questions