amit

Reputation: 3462

R - "linearizing" the results of tapply (to one single vector, unpacked by column)

In a data frame I have a vector of values, and vectors of categories that each value belongs to. I want to apply a function to the values that operates "by category", so I use tapply. For example, in my case I want to rescale the values within each category.

However, the result of tapply is a list of vectors of the rescaled values, and I need to unify (or "linearize") this list back into a single vector in the original order, so I can add a column of the rescaled values to my data frame.

I'm looking for a simple way to do that. Here is a sample:

x = 1:10
c = factor(c(1,2,1,2,1,2,1,2,1,2))
# I do the rescaling like this:
rescaled = tapply(x,list(c),function(x) as.vector(scale(x)))
# this looks like this:
$`1`
[1] -1.2649111 -0.6324555  0.0000000  0.6324555  1.2649111

$`2`
[1] -1.2649111 -0.6324555  0.0000000  0.6324555  1.2649111


# but really, I need to get something like this
[1] -1.2649111 -1.2649111 -0.6324555 -0.6324555  0.0000000  0.0000000
 [7]  0.6324555  0.6324555  1.2649111  1.2649111

Any suggestions?

thanks, amit

Upvotes: 2

Views: 1019

Answers (1)

Joris Meys

Reputation: 108573

Another job for the workhorse ave. Let me illustrate it with a data frame:

> mydf <- data.frame(x=1:10,myfac=factor(c(1,2,1,2,1,2,1,2,1,2)))
> within(mydf, scaledx <- ave(x,myfac,FUN=scale))
    x myfac    scaledx
1   1     1 -1.2649111
2   2     2 -1.2649111
3   3     1 -0.6324555
4   4     2 -0.6324555
5   5     1  0.0000000
6   6     2  0.0000000
7   7     1  0.6324555
8   8     2  0.6324555
9   9     1  1.2649111
10 10     2  1.2649111

If you look at ?ave, you will see that you can also group by several factors at once. If you want to add a column to a data frame, this is your most concise (albeit not the fastest) bet. Combined with within, you can compute the rescaled values and add the column in a single line of code.
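As a rough sketch of the same idea on plain vectors, the snippet below applies ave to the x from the question and gets the rescaled values back in the original order, with no list to unpack. The names grp, grp2 and zscore are made up for this illustration (grp replaces the question's c so that base::c is not masked), and grp2 is a hypothetical second grouping factor. The split/unsplit line is only a cross-check using base R, not something taken from the answer above.

```r
# Values and grouping factor from the question
x <- 1:10
grp <- factor(c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2))

# scale() returns a one-column matrix; as.vector() drops it to a plain vector
zscore <- function(v) as.vector(scale(v))

# ave() applies zscore within each level of grp and puts every result
# back at the position of the value it came from
rescaled <- ave(x, grp, FUN = zscore)

# Cross-check: split by group, rescale, then unsplit back to the original order
rescaled_check <- unsplit(lapply(split(x, grp), zscore), grp)

# Hypothetical second factor: further grouping variables are passed as
# extra arguments, and ave() works on every combination of their levels
grp2 <- factor(rep(c("a", "b"), each = 5))
rescaled2 <- ave(x, grp, grp2, FUN = zscore)
```

Because the result has the same length and order as x, it can be assigned straight to a new column, for example mydf$scaledx <- ave(mydf$x, mydf$myfac, FUN = zscore).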

Upvotes: 7
