Reputation: 939
I have a dataframe like this:
VisitID | No_Of_Visits | Store A | Store B | Store C | Store D|
A1 | 1 | 1 | 0 | 0 | 0 |
B1 | 2 | 1 | 0 | 0 | 1 |
C1 | 4 | 1 | 2 | 1 | 0 |
D1 | 3 | 2 | 0 | 1 | 0 |
E1 | 4 | 1 | 1 | 1 | 1 |
In R how can I summarize the Dataframe to show the % of visits of each Store Category by Visit count lvl? Expected result:
| No_Of_Visits | Store A | Store B | Store C | Store D|
| 1 | 100% | 0 | 0 | 0 |
| 2 | 50% | 0 | 0 | 50% |
| 3 | 67% | 0% | 33% | 0 |
| 4 | 25% | 38% | 25% | 13% |
I'm thinking of group_by(No_Of_Visits) and mutate_all?
Upvotes: 0
Views: 180
Reputation: 389047
We can get the data in long format and calculate the sum for each No_Of_Visits
and Store and then calculate their ratio before getting the data to wide format.
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols = starts_with('Store')) %>%
group_by(No_Of_Visits, name) %>%
summarise(value = sum(value)) %>%
mutate(value = round(value/sum(value) * 100, 2)) %>%
pivot_wider()
# No_Of_Visits Store.A Store.B Store.C Store.D
# <int> <dbl> <dbl> <dbl> <dbl>
#1 1 100 0 0 0
#2 2 50 0 0 50
#3 3 66.7 0 33.3 0
#4 4 25 37.5 25 12.5
Upvotes: 1