Reputation: 36080
I usually use the reshape package to aggregate some data (d'uh), usually together with plyr, because of its uber-awesome function each. Recently, I received a suggestion to switch to reshape2 and try it out, and now I can't seem to use the each wizardry anymore.
> m <- melt(mtcars, id.vars = c("am", "vs"), measure.vars = "hp")
> cast(m, am + vs ~ variable, each(min, max, mean, sd))
am vs hp_min hp_max hp_mean hp_sd
1 0 0 150 245 194.16667 33.35984
2 0 1 62 123 102.14286 20.93186
3 1 0 91 335 180.83333 98.81582
4 1 1 52 113 80.57143 24.14441
require(plyr)
> m <- melt(mtcars, id.vars = c("am", "vs"), measure.vars = "hp")
> dcast(m, am + vs ~ variable, each(min, max, mean, sd))
Error in structure(ordered, dim = ns) :
dims [product 4] do not match the length of object [16]
In addition: Warning messages:
1: In fs[[i]](x, ...) : no non-missing arguments to min; returning Inf
2: In fs[[i]](x, ...) : no non-missing arguments to max; returning -Inf
I wasn't in the mood to dig into this, as my previous code works like a charm with reshape, but I'd really like to know:
- Is it possible to use each with dcast?
- Is it advisable to use reshape2 at all? Is reshape deprecated?
Upvotes: 4
Views: 1843
Reputation: 173547
The answer to your first question appears to be no. Quoting from ?reshape2:::dcast:
If the combination of variables you supply does not uniquely identify one row in the original data set, you will need to supply an aggregating function, fun.aggregate. This function should take a vector of numbers and return a single summary statistic.
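To make the quoted requirement concrete, here is a minimal sketch (assuming reshape2 and plyr are installed; the name hp_mean is only for illustration). each(min, max, mean, sd) returns four values per call, so the four am/vs cells would yield 16 values where dcast expects 4, which appears to be what the error message in the question is reporting. A function that returns a single number works as documented:

```r
library(reshape2)
library(plyr)  # needed only for each()

m <- melt(mtcars, id.vars = c("am", "vs"), measure.vars = "hp")

# each() bundles the functions and returns one value per function,
# so a single call yields 4 numbers rather than 1
length(each(min, max, mean, sd)(m$value))

# A scalar-valued aggregate satisfies dcast's contract:
# one row per (am, vs) combination, one column for hp
hp_mean <- dcast(m, am + vs ~ variable, fun.aggregate = mean)
hp_mean
```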
A look at Hadley's GitHub page for reshape2 suggests that he knows this functionality was removed, but seems to think it's better done in plyr, presumably with something like:
ddply(m, .(am, vs), summarise,
      min = min(value),
      max = max(value),
      mean = mean(value),
      sd = sd(value))
or if you really want to keep using each:
ddply(m,.(am,vs),function(x){each(min,max,mean,sd)(x$value)})
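If you would rather stay inside reshape2, one possible workaround (my own sketch, not something the reshape2 documentation recommends) is to call dcast once per summary function and merge the pieces on the id columns. The names stats, pieces and res are hypothetical. The explicit fill is there on the assumption that dcast otherwise derives a fill value by calling the function on an empty vector, which would explain the min/max warnings in the question.

```r
library(reshape2)

m <- melt(mtcars, id.vars = c("am", "vs"), measure.vars = "hp")

# Named list of scalar summary functions, applied one at a time
stats <- list(min = min, max = max, mean = mean, sd = sd)

pieces <- lapply(names(stats), function(nm) {
  # One cast per function; each call returns columns am, vs, hp
  out <- dcast(m, am + vs ~ variable,
               fun.aggregate = stats[[nm]], fill = NA_real_)
  # Rename hp to hp_<stat>, mirroring the column names reshape's cast() produced
  names(out)[names(out) == "hp"] <- paste("hp", nm, sep = "_")
  out
})

# Join the per-function tables on the id columns
res <- Reduce(function(a, b) merge(a, b, by = c("am", "vs")), pieces)
res
```

This costs one pass over the data per function, so for anything large the ddply versions above are probably the better choice.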
Upvotes: 5