Reputation: 3462
In a data frame I have a vector of values and vectors of categories that each value belongs to. I want to apply a function to the values that operates "by category", so I use tapply. In my case, I want to rescale the values within each category.
However, tapply returns a list of vectors of rescaled values, one per category. I need to flatten (or "linearize") this list back into the original order so I can add a column of the rescaled values to my data frame.
I'm looking for a simple way to do that. Here is a sample:
x = 1:10
c = factor(c(1,2,1,2,1,2,1,2,1,2))
# I do the rescaling like this:
rescaled = tapply(x, list(c), function(x) as.vector(scale(x)))
# the result looks like this:
$`1`
[1] -1.2649111 -0.6324555 0.0000000 0.6324555 1.2649111
$`2`
[1] -1.2649111 -0.6324555 0.0000000 0.6324555 1.2649111
# but what I really need is something like this:
[1] -1.2649111 -1.2649111 -0.6324555 -0.6324555 0.0000000 0.0000000
[7] 0.6324555 0.6324555 1.2649111 1.2649111
Any suggestions?
thanks, amit
Upvotes: 2
Views: 1019
Reputation: 108573
Another job for the workhorse ave. Let me illustrate it with a data frame:
> mydf <- data.frame(x=1:10,myfac=factor(c(1,2,1,2,1,2,1,2,1,2)))
> within(mydf, scaledx <- ave(x,myfac,FUN=scale))
    x myfac    scaledx
1   1     1 -1.2649111
2   2     2 -1.2649111
3   3     1 -0.6324555
4   4     2 -0.6324555
5   5     1  0.0000000
6   6     2  0.0000000
7   7     1  0.6324555
8   8     2  0.6324555
9   9     1  1.2649111
10 10     2  1.2649111
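If you would rather keep the tapply call from the question, the list it returns can be put back into the original order with unsplit. The sketch below assumes the same data as the question; the name grp is my own choice (not from the question), used so the grouping vector does not mask the base function c.

```r
# Same data as in the question; grp is a hypothetical name for the
# grouping factor, chosen to avoid masking the base function c().
x   <- 1:10
grp <- factor(c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2))

# Per-group rescaling as in the question: one vector per factor level.
rescaled <- tapply(x, grp, function(v) as.vector(scale(v)))

# unsplit() reverses split(): each group's values go back to the
# positions where that group occurs in grp.
linear <- unsplit(rescaled, grp)

# ave() does the split, apply, and unsplit steps in a single call.
via_ave <- ave(x, grp, FUN = scale)

# Compare the two results.
all.equal(linear, via_ave)
```

Both routes should give the same vector; ave simply saves the intermediate list.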
If you look at ?ave, it tells you that you can also use a list of factors to do this. If you want to add a column to a data frame, this is your most concise (albeit not the fastest) bet. In combination with within, you can do both operations in a single line of code.
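As a rough sketch of grouping by more than one factor: as I read ?ave, the grouping variables are passed as additional arguments after x, and FUN has to be named because it comes after them. The data frame mydf2 and its columns site, treat, and centredx are made up for illustration, and I centre each group on its mean rather than calling scale, just to keep the numbers easy to check.

```r
# Hypothetical data: values grouped by two factors, site and treat.
mydf2 <- data.frame(
  x     = c(1, 2, 3, 4, 5, 6, 7, 8),
  site  = factor(c("a", "a", "a", "a", "b", "b", "b", "b")),
  treat = factor(c("t1", "t1", "t2", "t2", "t1", "t1", "t2", "t2"))
)

# Each site/treat combination is centred on its own mean.
# FUN must be passed by name because it follows the grouping arguments.
mydf2 <- within(mydf2,
                centredx <- ave(x, site, treat,
                                FUN = function(v) v - mean(v)))

mydf2
```

Every site/treat cell here holds two consecutive values, so the new column alternates between -0.5 and 0.5.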
Upvotes: 7