eugenego

Reputation: 183

Function to store data.frames and calculate mean?

I'm trying to write a function and got stuck. I need to run a function (ses.mpd) 1000 times with randomized matrices. Each output (a data.frame) should be stored, and then a single data.frame holding the element-wise means of those 1000 output data.frames should be calculated.

Example:

output data.frames

        ntaxa  mpd.obs  mpd.rand.mean  mpd.rand.sd
sample1   3       10         9               0.2
sample2   6       15         12              0.6
sample3   4       9          10              0.1


        ntaxa  mpd.obs  mpd.rand.mean  mpd.rand.sd
sample1   6       12         10              0.5
sample2   4       12         15              0.3
sample3   7       4          7               0.3

The result data.frame should look like this:

        ntaxa  mpd.obs  mpd.rand.mean  mpd.rand.sd
sample1   4.5     11         9.5            0.35
sample2   5       13.5       13.5           0.45
sample3   5.5     6.5        8.5            0.2

I think I have to save the 1000 data.frames in a list and then maybe use the ddply function in plyr, but I don't really have an idea how to do this.

Upvotes: 0

Views: 120

Answers (1)

Ari B. Friedman

Reputation: 72741

If all the matrices have the same layout (same dimensions, with the same variables in the same positions), then I would store them in a 3D array and average over the third dimension with apply or rowMeans. rowMeans will be faster.

Using a built-in dataset:

> dim(UCBAdmissions)
[1] 2 2 6
> rowMeans(UCBAdmissions, dims = 2)
          Gender
Admit          Male    Female
  Admitted 199.6667  92.83333
  Rejected 248.8333 213.00000
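
Applied to the data in the question, the same idea might look like the sketch below. It assumes every output data.frame is all-numeric with identical row and column order; `df1`, `df2`, `runs`, `arr`, and `means` are illustrative names, and the two hand-typed data.frames merely stand in for the stored ses.mpd outputs.

```r
# The two example outputs from the question (stand-ins for real ses.mpd results)
df1 <- data.frame(
  ntaxa = c(3, 6, 4), mpd.obs = c(10, 15, 9),
  mpd.rand.mean = c(9, 12, 10), mpd.rand.sd = c(0.2, 0.6, 0.1),
  row.names = c("sample1", "sample2", "sample3")
)
df2 <- data.frame(
  ntaxa = c(6, 4, 7), mpd.obs = c(12, 12, 4),
  mpd.rand.mean = c(10, 15, 7), mpd.rand.sd = c(0.5, 0.3, 0.3),
  row.names = c("sample1", "sample2", "sample3")
)

# In practice this list would hold all 1000 outputs
runs <- list(df1, df2)

# Stack the data.frames into a 3D array: samples x variables x runs
mats <- lapply(runs, as.matrix)
arr <- array(
  unlist(mats),
  dim = c(dim(mats[[1]]), length(mats)),
  dimnames = c(dimnames(mats[[1]]), list(NULL))
)

# Average over the third dimension (the runs), keeping the first two
means <- as.data.frame(rowMeans(arr, dims = 2))
print(means)

# Slower but equivalent alternative using apply
means_apply <- apply(arr, c(1, 2), mean)
```

With the two example data.frames this should reproduce the result table from the question. The apply version is handy if a different summary (say, median or sd) is wanted instead of the mean.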

Upvotes: 3
