Reputation: 8357

`tapply()` to return data frame

I have a dataset with a datetime column "date" (POSIXct), a "node" column (factor) and a "c" column (numeric), for example:

                 date node           c
1 2011-08-14 10:30:00    2 0.051236000
2 2011-08-14 10:30:00    2 0.081230000
3 2011-08-14 10:31:00    1 0.000000000
4 2011-08-14 10:31:00    4 0.001356337
5 2011-08-14 10:31:00    3 0.001356337
6 2011-08-14 10:32:00    2 0.000000000

I need to take the mean of column "c" for all pairs of "date" and "node", so I did this:

tapply(data$c, list(data$node, data$date), mean)

The result I obtain is what I want, but in a strange structure:

num [1:5, 1:8923] 0 0 0.00092 0.00146 NA ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:5] "1" "2" "3" "4" ...
  ..$ : chr [1:8923] "2011-08-14 10:30:00" "2011-08-14 10:31:00" "2011-08-14 10:32:00" "2011-08-14 10:33:00" ...

A sample of the output looks like this:

  2011-08-17 23:56:00 2011-08-17 23:57:00 2011-08-17 23:58:00
1        4.759077e-05        4.759077e-05        4.759077e-05
2        0.000000e+00        3.875248e-05        1.595690e-04
3        1.134391e-03        1.134391e-03        1.109730e-03
4        4.882813e-04        6.914658e-04        4.955846e-04
5        0.000000e+00        0.000000e+00        0.000000e+00

What I was going for was something like the original structure, with a datetime, the node factor and the "c" value. I cannot figure out how to achieve this. Any help would be appreciated.

Many thanks.

Upvotes: 3

Answers (3)

John

Reputation: 23768

You might try...

aggregate( c ~ node + date, data = data, FUN = mean )
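
As a minimal sketch of what this returns, the six sample rows from the question can be rebuilt by hand and passed to the same call. The `tz = "UTC"` setting and the variable name `res` are assumptions made for illustration; they are not in the original post.

```r
# Rebuild the six sample rows shown in the question (time zone is an assumption).
data <- data.frame(
  date = as.POSIXct(c("2011-08-14 10:30:00", "2011-08-14 10:30:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:31:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:32:00"),
                    tz = "UTC"),
  node = factor(c(2, 2, 1, 4, 3, 2)),
  c    = c(0.051236, 0.08123, 0, 0.001356337, 0.001356337, 0)
)

# One row per (node, date) pair that actually occurs, with the mean of c.
res <- aggregate(c ~ node + date, data = data, FUN = mean)
str(res)
```

The result should be an ordinary data frame with the columns node, date and c, which is the long layout the question asks for.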

Upvotes: 7

IRTFM

Reputation: 263489

Instead of `tapply`, you want to use `ave`:

data$grp.mean <- ave(data$c, list(data$node, data$date), FUN= mean)
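
Note that `ave` returns a vector as long as its input, so every row gets its group mean and duplicate (date, node) pairs remain. A small sketch on the question's sample rows follows; the `tz = "UTC"` setting, the name `res`, and the `unique()` step for collapsing to one row per pair are assumptions added for illustration.

```r
# Rebuild the six sample rows shown in the question (time zone is an assumption).
data <- data.frame(
  date = as.POSIXct(c("2011-08-14 10:30:00", "2011-08-14 10:30:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:31:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:32:00"),
                    tz = "UTC"),
  node = factor(c(2, 2, 1, 4, 3, 2)),
  c    = c(0.051236, 0.08123, 0, 0.001356337, 0.001356337, 0)
)

# ave() keeps the original length: each row receives the mean of its group.
data$grp.mean <- ave(data$c, list(data$node, data$date), FUN = mean)

# Optional extra step (not in the answer): keep one row per (date, node) pair.
res <- unique(data[, c("date", "node", "grp.mean")])
res
```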

Looking again at this I am wondering if you wanted to have the aggregation done on the basis of "date" in the calendar sense of 24 hours?

If you want to reuse the result you already have (assuming it is named "M"), you might try:

require(reshape2)
newdf <- melt(t(M))
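
If reshape2 is not available, a similar long layout can be had from base R. This is a different technique from the `melt` call above, sketched here under assumptions: `M` is rebuilt from the question's sample rows, the time zone is set to UTC, and the names `long` and the renamed columns are chosen for illustration.

```r
# Rebuild the six sample rows shown in the question (time zone is an assumption).
data <- data.frame(
  date = as.POSIXct(c("2011-08-14 10:30:00", "2011-08-14 10:30:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:31:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:32:00"),
                    tz = "UTC"),
  node = factor(c(2, 2, 1, 4, 3, 2)),
  c    = c(0.051236, 0.08123, 0, 0.001356337, 0.001356337, 0)
)

# The node-by-date matrix from the question; empty cells are NA.
M <- tapply(data$c, list(data$node, data$date), mean)

# Treat the matrix as a table and unfold it into one row per cell.
long <- as.data.frame(as.table(M), responseName = "c")
names(long) <- c("node", "date", "c")

# Drop the (node, date) combinations that never occurred.
long <- long[!is.na(long$c), ]

# The dimnames are character strings, so convert the dates back to POSIXct.
long$date <- as.POSIXct(as.character(long$date), tz = "UTC")
long
```

Unlike `aggregate`, this route goes through every cell of the matrix first, so the NA filter is what removes the unobserved pairs.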

Upvotes: 4

joran

Reputation: 173737

If you want output that's a data frame with three columns, you probably would benefit from looking at the plyr package (assuming your data are stored in dat):

library(plyr)
ddply(dat,.(date,node),summarise,m = mean(c))

Upvotes: 4
