Reputation: 107
I'm trying to reshape some data tables using tapply. That's straightforward if you have one factor, one variable, and the function you want to apply. However, I have some datasets that I'd like to summarize by two (or perhaps more) grouping levels.
Consider:
x<-1:20 # variable
y<-factor(rep(letters[1:5], each=4)) # first grouping variable
z<-factor(rep(letters[6:7], each=10)) # second grouping variable
tapply(x,z,sum) # summarized table for factor z
f g
55 155
tapply(x,y,sum) # summarized table for factor y
a b c d e
10 26 42 58 74
However, my desired output would be a table that looks something like:
f f f f f g g g g g
a b c d e a b c d e
6 8 10....etc
So I'm just trying to keep the higher-level grouping in the table. Sorry if this is a simple question; I've looked around and can't find anything.
Upvotes: 0
Views: 10816
Reputation: 803
This is the code I've used on my own data:
with(reduced, do.call(rbind, tapply(WR, list(period, no.C),
function(x) c(WR = mean(x), SD = sd(x)))))
reduced is my data frame
WR is the variable I want to calculate the mean of
period is one of my grouping variables; in this case it's binary
no.C is another grouping variable; here I have 3 groups
The rest of the expression is the function. It can easily be replaced by just writing mean (or sum, or whatever other statistic you are after) if you only want one value, but I also want the standard deviation, so I bind the results into a little table with rbind that I can print later. Sorry I didn't put the answer in the context of your data, but I was confused as to what exactly you wanted.
Basically, by passing a list as the grouping argument you can use as many grouping variables as you want while still using tapply.
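Applied to the data in the question, the list approach might look like the sketch below. The names res and flat are only for illustration, and the interaction() step is an assumption about what the asker meant by a single row labelled with both factors.

```r
x <- 1:20                                   # variable
y <- factor(rep(letters[1:5], each = 4))    # first grouping variable
z <- factor(rep(letters[6:7], each = 10))   # second grouping variable

# Two grouping factors in a list give a z-by-y matrix of sums;
# combinations that never occur in the data come back as NA.
res <- tapply(x, list(z, y), sum)
res
#    a  b  c  d  e
# f 10 26 19 NA NA
# g NA NA 23 58 74

# For one flat, named result instead, combine the factors first
# and drop the combinations that do not occur.
flat <- tapply(x, interaction(z, y, drop = TRUE), sum)
flat
```

The matrix form keeps the higher-level grouping as rows, while the flat form carries both levels in each name (f.a, f.b, and so on).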
You can also do something similar with aggregate (see this quick web page for a tidy answer and examples relevant to your question):
with(reduced, aggregate(WR, list(period, no.C), mean))
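On the data from the question, the aggregate version could be sketched as follows. The data frame d and the result name agg are made up for this example; naming the list elements is optional and only there so the output columns are called z and y rather than Group.1 and Group.2.

```r
# Rebuild the question's data as a data frame (d is a hypothetical name)
d <- data.frame(
  x = 1:20,
  y = factor(rep(letters[1:5], each = 4)),
  z = factor(rep(letters[6:7], each = 10))
)

# One row per combination of z and y that actually occurs,
# with the sum of x for that combination in column x
agg <- aggregate(d$x, by = list(z = d$z, y = d$y), FUN = sum)
agg
```

Unlike tapply, this returns a long data frame, so combinations that never occur simply do not appear instead of showing up as NA.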
Upvotes: 3
Reputation: 116
You can use the dplyr package, which is much easier and much faster if you are dealing with large datasets. However, it only works with data frames, so put the vectors into one first:
library(dplyr) # provides group_by() and summarise()
d <- data.frame(x=x,y=y,z=z)
For the first case:
groups <- group_by(d,z)
summarise(groups,sum(x))
z sum(x)
1 f 55
2 g 155
For the second case:
groups <- group_by(d,y)
summarise(groups,sum(x))
y sum(x)
1 a 10
2 b 26
3 c 42
4 d 58
5 e 74
And for the last case, grouping by both factors:
groups <- group_by(d,z,y)
summarise(groups,sum(x))
z y sum(x)
1 f a 10
2 f b 26
3 f c 19
4 g c 23
5 g d 58
6 g e 74
Upvotes: 1