Reputation: 21212
A dataframe:
mydf <- data.frame(
  x = rep(letters[1:3], 4),
  y = rnorm(12, 0, 3)
)
I can easily mutate a new column z that is the value of y plus or minus a random number:
mydf <- mydf %>%
  mutate(z = rnorm(nrow(.), mean = 0, sd = sd(y)))
What I would like to do is create z as a random number, but when setting the sd, use the sd of y for that letter only.
Tried:
mydf <- mydf %>%
  group_by(x) %>%
  mutate(z = rnorm(nrow(.), mean = 0, sd = sd(y)))
Error: Problem with `mutate()` input `z`.
x Input `z` can't be recycled to size 4.
ℹ Input `z` is `rnorm(nrow(.), mean = 0, sd = sd(y))`.
ℹ Input `z` must be size 4 or 1, not 12.
ℹ The error occurred in group 1: x = "a".
How can I add z, the value of y plus or minus a random number whose sd equals the sd of y within that group rather than the sd of the column as a whole?
Upvotes: 1
Views: 329
Reputation: 887088
Here, nrow(.) ignores the grouping and returns the number of rows of the entire data frame (12), whereas mutate requires the new column to have the same length as the current group (4 here) or length 1. So rnorm(nrow(.), ...) returns 12 values for a 4-row group, which triggers the recycling error unless we wrap the column in a list, which may not be what the OP wanted.
library(dplyr)
mydf %>%
  group_by(x) %>%
  summarise(n = nrow(.))
# A tibble: 3 x 2
#   x         n
#   <chr> <int>
# 1 a        12 ###
# 2 b        12 ###
# 3 c        12 ###
We can use n() instead, which returns the number of rows in the current group:
mydf %>%
  group_by(x) %>%
  mutate(z = rnorm(n(), mean = 0, sd = sd(y)))
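If z should be y plus or minus the noise, as the question's wording suggests, something like the following may work. It is only a sketch: the set.seed call, the helper columns group_n and group_sd, and the final ungroup() are assumptions added for illustration and are not part of the original question or answer.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible

mydf <- data.frame(
  x = rep(letters[1:3], 4),
  y = rnorm(12, 0, 3)
)

result <- mydf %>%
  group_by(x) %>%
  mutate(
    group_n  = n(),     # hypothetical helper: rows in the current group
    group_sd = sd(y),   # hypothetical helper: sd of y within the current group
    # y plus noise drawn with the group's own sd; n() matches the group size
    z = y + rnorm(n(), mean = 0, sd = sd(y))
  ) %>%
  ungroup()             # drop the grouping so later steps see the whole table

result
```

The group_n and group_sd columns are there only to check that n() and sd(y) are evaluated per group; they can be dropped once the result looks right.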
Upvotes: 1