Reputation: 13345
On my attempt to learn dplyr
, I want to divide each row by another row, representing the corresponding group's total.
I generated test data with
library(dplyr)
# building test data
data("OrchardSprays")
totals <- OrchardSprays %>% group_by(treatment) %>%
summarise(decrease = sum(decrease))
totals$decrease <- totals$decrease + seq(10, 80, 10)
totals$rowpos = totals$colpos <- "total"
df <- rbind(OrchardSprays, totals)
Note the line totals$decrease <- totals$decrease + seq(10, 80, 10)
: for the sake of the question, I assumed there was an additional decrease
for each treatment
, which was not observed in the single lines of the data frame but only in the "total" lines for each group.
What I now want to do is adding another column decrease_share
to the data frame where each line's decrease
value is divided by the corresponding treatment
groups total decrease value
.
So, for head(df)
I would expect an output like this
> head(df)
decrease rowpos colpos treatment treatment_decrease
1 57 1 1 D 0.178125
2 95 2 1 E 0.1711712
3 8 3 1 B 0.09876543
4 69 4 1 H 0.08603491
5 92 5 1 G 0.1488673
6 90 6 1 F 0.1470588
My real world example is a bit more complex (more group variables and also more levels), therefore I am looking for a suitable solution in dplyr
.
Upvotes: 0
Views: 152
Reputation: 24878
Here's a total dplyr
approach:
library(dplyr) #version >= 1.0.0
OrchardSprays %>%
group_by(treatment) %>%
summarise(decrease = sum(decrease)) %>%
mutate(decrease = decrease + seq(10, 80, 10),
rowpos = "total",
colpos = "total") %>%
bind_rows(mutate(OrchardSprays, across(rowpos:colpos, as.character))) %>%
group_by(treatment) %>%
mutate(treatment_decrease = decrease / decrease[rowpos == "total"])
# A tibble: 72 x 5
# Groups: treatment [8]
treatment decrease rowpos colpos treatment_decrease
<fct> <dbl> <chr> <chr> <dbl>
1 A 47 total total 1
2 B 81 total total 1
3 C 232 total total 1
4 D 320 total total 1
5 E 555 total total 1
6 F 612 total total 1
7 G 618 total total 1
8 H 802 total total 1
9 D 57 1 1 0.178
10 E 95 2 1 0.171
# … with 62 more rows
Upvotes: 1