gozzilli

Reputation: 8357

`tapply()` to return data frame

I have a dataset with a datetime (POSIXct) column "date", a factor column "node" and a numeric column "c", for example:

                 date node           c
1 2011-08-14 10:30:00    2 0.051236000
2 2011-08-14 10:30:00    2 0.081230000
3 2011-08-14 10:31:00    1 0.000000000
4 2011-08-14 10:31:00    4 0.001356337
5 2011-08-14 10:31:00    3 0.001356337
6 2011-08-14 10:32:00    2 0.000000000

I need to take the mean of column "c" for all pairs of "date" and "node", so I did this:

tapply(data$c, list(data$node, data$date), mean)

The result I obtain is what I want, but in a strange structure:

num [1:5, 1:8923] 0 0 0.00092 0.00146 NA ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:5] "1" "2" "3" "4" ...
  ..$ : chr [1:8923] "2011-08-14 10:30:00" "2011-08-14 10:31:00" "2011-08-14 10:32:00" "2011-08-14 10:33:00" ...

An excerpt of the output looks like this:

  2011-08-17 23:56:00 2011-08-17 23:57:00 2011-08-17 23:58:00
1        4.759077e-05        4.759077e-05        4.759077e-05
2        0.000000e+00        3.875248e-05        1.595690e-04
3        1.134391e-03        1.134391e-03        1.109730e-03
4        4.882813e-04        6.914658e-04        4.955846e-04
5        0.000000e+00        0.000000e+00        0.000000e+00

What I was going for was something like the original structure, with a datetime, the node factor and the "c" value. I cannot figure out how to achieve this. Any help would be appreciated.

Many thanks.

Upvotes: 3

Views: 2694

Answers (3)

John

Reputation: 23758

You might try...

aggregate( c ~ node + date, data = data, FUN = mean )
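As a minimal sketch of what this returns, here is the call run on the six sample rows from the question. The `tz = "UTC"` setting and the name `res` are assumptions made for illustration; the question does not state a time zone.

```r
# Sample rows copied from the question (time zone UTC is an assumption).
data <- data.frame(
  date = as.POSIXct(c("2011-08-14 10:30:00", "2011-08-14 10:30:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:31:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:32:00"),
                    tz = "UTC"),
  node = factor(c(2, 2, 1, 4, 3, 2)),
  c    = c(0.051236, 0.081230, 0, 0.001356337, 0.001356337, 0)
)

# One row per (node, date) pair that actually occurs, holding the mean of c.
res <- aggregate(c ~ node + date, data = data, FUN = mean)

# The result is a data frame with columns node, date and c.
str(res)
```

Only combinations present in the data appear in the result, so there are no NA rows for missing node/date pairs, unlike the matrix returned by `tapply()`.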

Upvotes: 7

IRTFM

Reputation: 263471

Instead of tapply you want to use ave, which returns a vector the same length as its input, so each row gets the mean of its group:

data$grp.mean <- ave(data$c, list(data$node, data$date), FUN= mean)

Looking at this again, I wonder whether you wanted the aggregation done on the basis of "date" in the calendar sense of 24 hours?

If you wanted to use the results you already have (assuming they are named "M"), you might want to try:

require(reshape2)
newdf <- melt(t(M))
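If reshape2 is not available, a similar long format can be reached in base R. This is a sketch of an alternative route via `as.data.frame(as.table())`, not the `melt()` call above; the sample rows come from the question, and `tz = "UTC"` and the name `long` are assumptions for illustration.

```r
# Sample rows copied from the question (time zone UTC is an assumption).
data <- data.frame(
  date = as.POSIXct(c("2011-08-14 10:30:00", "2011-08-14 10:30:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:31:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:32:00"),
                    tz = "UTC"),
  node = factor(c(2, 2, 1, 4, 3, 2)),
  c    = c(0.051236, 0.081230, 0, 0.001356337, 0.001356337, 0)
)

# The matrix from the question; naming the list elements names the dimensions.
M <- tapply(data$c, list(node = data$node, date = data$date), mean)

# Unroll the matrix into one row per cell, with the cell value stored in "c".
long <- as.data.frame(as.table(M), responseName = "c",
                      stringsAsFactors = FALSE)

# Drop the node/date combinations that never occurred (NA cells in M).
long <- long[!is.na(long$c), ]

# The dates come back as character dimnames, so convert them to POSIXct again.
long$date <- as.POSIXct(long$date, tz = "UTC")
```

The conversion back to POSIXct is needed because `tapply()` keeps the dates only as character dimnames.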

Upvotes: 4

joran

Reputation: 173697

If you want the output as a data frame with three columns, the plyr package is worth a look (assuming your data are stored in dat):

library(plyr)
ddply(dat,.(date,node),summarise,m = mean(c))

Upvotes: 4
