Reputation: 9018
I have the d2 data frame below and I'd like to add a column that is the average of x by group and flag. However, the flag column is itself created inside the mutate() call, and I am not sure how to then compute an average by group AND flag.
library(dplyr)
d = data.frame(x = c(seq(1, 5, 1), seq(11, 15, 1), 100, 1000), group = c(rep("A", 5), rep("B", 5), "A", "B"))
d
d2 = d %>%
  group_by(group) %>%
  mutate(
    U = quantile(x, 0.75) + 1.5 * IQR(x),  # upper fence
    L = quantile(x, 0.25) - 1.5 * IQR(x),  # lower fence
    flag = ifelse(x > U | x < L, 1, 0),    # 1 = outside the fences
    mu = mean(x)                           # mean by group only
  )
as.data.frame(d2)
I've added a "result" column showing what the output should be:
x group U L flag mu result
1 1 A 8.5 -1.5 0 19.16667 3
2 2 A 8.5 -1.5 0 19.16667 3
3 3 A 8.5 -1.5 0 19.16667 3
4 4 A 8.5 -1.5 0 19.16667 3
5 5 A 8.5 -1.5 0 19.16667 3
6 11 B 18.5 8.5 0 177.50000 13
7 12 B 18.5 8.5 0 177.50000 13
8 13 B 18.5 8.5 0 177.50000 13
9 14 B 18.5 8.5 0 177.50000 13
10 15 B 18.5 8.5 0 177.50000 13
11 100 A 8.5 -1.5 1 19.16667 100
12 1000 B 18.5 8.5 1 177.50000 1000
Note that calling group_by(group, flag) before mutate() returns:
Error: unknown variable to group by : flag
Upvotes: 0
Views: 1099
Reputation: 21621
Simply add group_by(group, flag) to the chain after your initial operations and then call mutate() again. By that point flag exists as a column, so it can be used for grouping:
d %>%
  group_by(group) %>%
  mutate(
    U = quantile(x, 0.75) + 1.5 * IQR(x),
    L = quantile(x, 0.25) - 1.5 * IQR(x),
    flag = ifelse(x > U | x < L, 1, 0),
    mu = mean(x)) %>%
  group_by(group, flag) %>%
  mutate(result = mean(x))
Which gives:
#Source: local data frame [12 x 7]
#Groups: group, flag [4]
#
# x group U L flag mu result
# <dbl> <fctr> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 1 A 8.5 -1.5 0 19.16667 3
#2 2 A 8.5 -1.5 0 19.16667 3
#3 3 A 8.5 -1.5 0 19.16667 3
#4 4 A 8.5 -1.5 0 19.16667 3
#5 5 A 8.5 -1.5 0 19.16667 3
#6 11 B 18.5 8.5 0 177.50000 13
#7 12 B 18.5 8.5 0 177.50000 13
#8 13 B 18.5 8.5 0 177.50000 13
#9 14 B 18.5 8.5 0 177.50000 13
#10 15 B 18.5 8.5 0 177.50000 13
#11 100 A 8.5 -1.5 1 19.16667 100
#12 1000 B 18.5 8.5 1 177.50000 1000
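As the "Groups: group, flag [4]" line in the output shows, the result is still grouped. Here is a minimal variant of the same chain, assuming dplyr 1.0.0 or later (where group_by() has the .add argument). It adds flag to the existing grouping instead of restating both columns, and drops the grouping at the end. The compact 1:5 and 11:15 shorthand and the final ungroup() are my additions, not part of the original code.

```r
library(dplyr)

# Same data as in the question, written more compactly.
d <- data.frame(
  x = c(1:5, 11:15, 100, 1000),
  group = c(rep("A", 5), rep("B", 5), "A", "B")
)

d2 <- d %>%
  group_by(group) %>%
  mutate(
    U = quantile(x, 0.75) + 1.5 * IQR(x),
    L = quantile(x, 0.25) - 1.5 * IQR(x),
    flag = ifelse(x > U | x < L, 1, 0),
    mu = mean(x)
  ) %>%
  # flag now exists, so it can be added to the current grouping.
  # Assumption: dplyr >= 1.0.0 for the .add argument.
  group_by(flag, .add = TRUE) %>%
  mutate(result = mean(x)) %>%
  # Drop the grouping so later steps operate on the whole data frame.
  ungroup()

as.data.frame(d2)
```

Without the ungroup(), any later mutate() or summarise() would still be computed within each group/flag combination, which is easy to overlook.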
Upvotes: 2