Reputation: 747
I have some plant data almost identical to the 'iris' data set. I would like to simulate new data using a normal distribution. So for each variable~species in the iris data set I would create 10 new observations from a normal distribution. Basically it would just create a new data frame with the same structure as the old one, but it would contain simulated data. I feel that the following code should get me started (I think the data frame would be in the wrong form), but it will not run.
ddply(iris, c("Species"), function(x) data.frame(rnorm(n=10, mean=mean(x), sd=sd(x))))
rnorm is returning an atomic vector so ddply should be able to handle it.
Upvotes: 0
Views: 172
Reputation: 206167
The ddply call will subset the rows by Species, but you're doing nothing in the function to iterate over the columns of the subsetted data.frame. You cannot get rnorm() to return a list or data.frame for you; you will need to assist with the shaping. How about
ddply(iris, c("Species"), function(x) {
data.frame(lapply(x[,1:4], function(y) rnorm(10, mean(y), sd(y))))
})
Here we calculate new values for the first 4 columns in each group; ddply adds the Species column back to the combined result.
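If you would rather not depend on plyr, the same per-species, per-column simulation can be sketched in base R. This is only an illustration of the idea above, not a replacement for the ddply version: the names `n_new`, `num_cols`, and `simulate_group` are made up for this example, and the fixed seed is an assumption added so the draws are reproducible.

```r
# Base-R sketch: simulate n_new rows per species from per-column normals.
# n_new, num_cols and simulate_group are illustrative names, not part of any package.
set.seed(1)   # assumption: fixed seed so the result is reproducible
n_new <- 10

# Pick the numeric columns instead of hard-coding 1:4
num_cols <- names(iris)[sapply(iris, is.numeric)]

simulate_group <- function(grp) {
  # One rnorm() call per numeric column, using that column's mean and sd within the group
  as.data.frame(lapply(grp[num_cols], function(y) rnorm(n_new, mean = mean(y), sd = sd(y))))
}

# Split by species, simulate each group, then stack the pieces with a Species column
groups <- split(iris, iris$Species)
sim <- do.call(rbind, lapply(names(groups), function(sp) {
  data.frame(simulate_group(groups[[sp]]), Species = sp)
}))
sim$Species <- factor(sim$Species, levels = levels(iris$Species))
rownames(sim) <- NULL

str(sim)
```

The result has the same columns as iris, with `n_new` simulated rows per species. Note that each column is drawn independently, so correlations between variables within a species are not preserved.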
Upvotes: 1