Reputation: 319
In my data below, I first want to group_by(study), get the mean of X for each unique study value, and subtract it from each X value in that study.
Second, while group_by(study) is still in effect, I want to further group_by(outcome) within each study, get the mean of X for each unique outcome value within each study, and subtract it from each X value in that outcome within that study.
I'm using the following workaround, but it doesn't seem to achieve my goal, because the group_by(outcome) call appears to ignore the previous group_by(study).
Is there a way to achieve what I described above?
library(dplyr)
set.seed(0)
(data <- expand.grid(study = 1:2, outcome = rep(1:2,2)))
data$X <- rnorm(nrow(data))
(data <- arrange(data,study))
#  study outcome          X
#1     1       1  1.2629543
#2     1       2  1.3297993
#3     1       1  0.4146414
#4     1       2 -0.9285670
#5     2       1 -0.3262334
#6     2       2  1.2724293
#7     2       1 -1.5399500
#8     2       2 -0.2947204
data %>%
  group_by(study) %>%
  mutate(X_between_st = mean(X), X_within_st = X - X_between_st) %>%
  group_by(outcome) %>%
  mutate(X_between_ou = mean(X), X_within_ou = X - X_between_ou)
Upvotes: 1
Views: 150
Reputation: 887203
We may use cur_group() to check which variables the data is currently grouped by:
data %>%
  group_by(study) %>%
  summarise(grps = names(cur_group())) %>%
  slice(1) %>%
  pull(grps)
[1] "study"
Upvotes: 0
Reputation: 388992
Yes, the second group_by overwrites the previous group_by, which can be checked with the group_vars function.
library(dplyr)
data %>%
  group_by(study) %>%
  mutate(X_between_st = mean(X), X_within_st = X - X_between_st) %>%
  group_by(outcome) %>%
  group_vars()
#[1] "outcome"
As you can see, at this stage the data is grouped only by outcome.
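To make the consequence concrete, here is a minimal sketch (the names overwritten, n_combos and n_means are only illustrative, not from the original post). It rebuilds the question's data and counts how many distinct outcome means are produced when the second group_by replaces the first. Since the means are pooled across studies, there should be fewer distinct means than study-outcome combinations.

```r
library(dplyr)

# Rebuild the example data from the question
set.seed(0)
data <- expand.grid(study = 1:2, outcome = rep(1:2, 2))
data$X <- rnorm(nrow(data))

# The second group_by() replaces the first, so means are pooled across studies
overwritten <- data %>%
  group_by(study) %>%
  group_by(outcome) %>%
  mutate(X_between_ou = mean(X)) %>%
  ungroup()

# Number of study-outcome combinations vs. number of distinct means computed
n_combos <- nrow(distinct(data, study, outcome))
n_means  <- n_distinct(overwritten$X_between_ou)
c(combinations = n_combos, distinct_means = n_means)
```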
You can achieve your goal by including .add = TRUE in group_by, which adds to the existing groups instead of replacing them.
data %>%
  group_by(study) %>%
  mutate(X_between_st = mean(X), X_within_st = X - X_between_st) %>%
  group_by(outcome, .add = TRUE) %>%
  group_vars()
#[1] "study" "outcome"
So the final code becomes:
data %>%
  group_by(study) %>%
  mutate(X_between_st = mean(X), X_within_st = X - X_between_st) %>%
  group_by(outcome, .add = TRUE) %>%
  mutate(X_between_ou = mean(X), X_within_ou = X - X_between_ou)
#  study outcome      X X_between_st X_within_st X_between_ou X_within_ou
#  <int>   <int>  <dbl>        <dbl>       <dbl>        <dbl>       <dbl>
#1     1       1  1.26         0.520      0.743         0.839       0.424
#2     1       2  1.33         0.520      0.810         0.201       1.13
#3     1       1  0.415        0.520     -0.105         0.839      -0.424
#4     1       2 -0.929        0.520     -1.45          0.201      -1.13
#5     2       1 -0.326       -0.222     -0.104        -0.933       0.607
#6     2       2  1.27        -0.222      1.49          0.489       0.784
#7     2       1 -1.54        -0.222     -1.32         -0.933      -0.607
#8     2       2 -0.295       -0.222     -0.0726        0.489      -0.784
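As a sanity check, the stepwise grouping can be compared with naming both grouping variables explicitly in the second step. This is only a sketch, assuming dplyr 1.0.0 or later (where the .add argument is available). The object names stepwise and direct are illustrative.

```r
library(dplyr)

# Rebuild the example data from the question
set.seed(0)
data <- expand.grid(study = 1:2, outcome = rep(1:2, 2))
data$X <- rnorm(nrow(data))
data <- arrange(data, study)

# Stepwise grouping: study first, then outcome added with .add = TRUE
stepwise <- data %>%
  group_by(study) %>%
  mutate(X_between_st = mean(X), X_within_st = X - X_between_st) %>%
  group_by(outcome, .add = TRUE) %>%
  mutate(X_between_ou = mean(X), X_within_ou = X - X_between_ou) %>%
  ungroup()

# Alternative: name both grouping variables explicitly in the second step
direct <- data %>%
  group_by(study) %>%
  mutate(X_between_st = mean(X), X_within_st = X - X_between_st) %>%
  group_by(study, outcome) %>%
  mutate(X_between_ou = mean(X), X_within_ou = X - X_between_ou) %>%
  ungroup()

# Both approaches should give the same within-outcome deviations
stopifnot(isTRUE(all.equal(stepwise$X_within_ou, direct$X_within_ou)))

# Deviations from a group mean sum to (numerically) zero within each group
stepwise %>%
  group_by(study, outcome) %>%
  summarise(dev_sum = sum(X_within_ou), .groups = "drop")
```

Using .add = TRUE is convenient in a longer pipeline because the second step does not need to repeat the grouping variables that are already in effect.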
Upvotes: 3