Sebastiano Caliò

Reputation: 83

R: replicate tapply group means for every row of the data frame

I have a dataset like this:

       Anno.2013 Giorni.2013 Anno.2014 Giorni.2014 Stagionalità Destagionata2013
1         18         mar        17         mer        Bassa        9.3710954
2          9         mer         5         gio        Bassa        4.6855477
3          9         gio         2         ven        Bassa        4.6855477
4          8         ven         5         sab        Bassa        4.1649313
5          4         sab         2         dom        Bassa        2.0824656
6          2         dom         0         lun        Bassa        1.0412328
7          1         lun         1         mar        Bassa        0.5206164
8          0         mar         0         mer        Bassa        0.0000000
9          2         mer         0         gio        Bassa        1.0412328
10         0         gio         1         ven        Bassa        0.0000000
   Destagionata2014 Settimana2013 Settimana2014
1         9.4463412             1             1
2         2.7783356             1             1
3         1.1113343             1             1
4         2.7783356             1             1
5         1.1113343             1             1
6         0.0000000             1             2
7         0.5556671             2             2
8         0.0000000             2             2
9         0.0000000             2             2
10        0.5556671             2             2

> str(domanda)
'data.frame':   365 obs. of  9 variables:
 $ Anno.2013       : int  18 9 9 8 4 2 1 0 2 0 ...
 $ Giorni.2013     : Factor w/ 7 levels "dom","gio","lun",..: 4 5 2 7 6 1 3 4 5 2 ...
 $ Anno.2014       : int  17 5 2 5 2 0 1 0 0 1 ...
 $ Giorni.2014     : Factor w/ 7 levels "dom","gio","lun",..: 5 2 7 6 1 3 4 5 2 7 ...
 $ Stagionalità    : Factor w/ 2 levels "Alta","Bassa": 2 2 2 2 2 2 2 2 2 2 ...
 $ Destagionata2013: num  9.37 4.69 4.69 4.16 2.08 ...
 $ Destagionata2014: num  9.45 2.78 1.11 2.78 1.11 ...
 $ Settimana2013   : Factor w/ 53 levels "1","2","3","4",..: 1 1 1 1 1 1 2 2 2 2 ...
 $ Settimana2014   : Factor w/ 53 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ...

I would like to divide every value of Destagionata2013 by the mean of Destagionata2013 within its Settimana2013 group. For example, for the first week:

Destagionata2013[1:6]/mean(Destagionata2013[1:6])

I tried using tapply:

Media_Settimana<-as.vector(tapply(domanda$Anno.2013, domanda$Settimana2013, mean))
Media_Settimana

> Media_Settimana
 [1]  8.333333  5.857143  3.142857  4.285714  6.428571  6.714286 13.714286  3.428571
 [9]  4.000000  3.285714 11.428571  6.285714 11.714286  7.285714 12.142857 12.000000
[17] 16.000000 20.857143 19.428571 23.428571 33.857143 31.000000 31.714286 32.428571
[25] 38.571429 41.000000 36.000000 38.714286 36.714286 39.857143 40.714286 39.857143
[33] 41.714286 41.857143 41.142857 40.571429 40.428571 37.857143 32.714286 19.714286
[41]  9.000000  4.142857  5.857143 16.285714 11.000000  8.428571  4.428571  6.857143
[49]  6.285714  3.857143  7.000000  5.571429 18.500000

But I am not able to replicate the group means for every row.

Upvotes: 1

Views: 38

Answers (1)

BrodieG

Reputation: 52657

As MrFlick notes, you need ave instead of tapply: ave automatically recycles each group's length-1 result to the length of the input, so you get one value per row. Here is the same idea with iris (normalizing Sepal.Length by the mean Sepal.Width within each species):

transform(iris, norm.sep.len=Sepal.Length / ave(Sepal.Width, Species, FUN=mean))
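Applied to the data in the question, it might look something like the sketch below. The small data frame is only a stand-in built from the ten rows printed in the question, and the names media_settimana, Norm2013 and medie are made up for illustration. The last lines show that indexing the tapply result by week gives the same per-row means, if you prefer to stay with tapply.

```r
# Stand-in for the first ten rows of `domanda` shown in the question
domanda <- data.frame(
  Destagionata2013 = c(9.3710954, 4.6855477, 4.6855477, 4.1649313, 2.0824656,
                       1.0412328, 0.5206164, 0.0000000, 1.0412328, 0.0000000),
  Settimana2013 = factor(c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2))
)

# ave() returns one group mean per row, in the original row order
media_settimana <- ave(domanda$Destagionata2013, domanda$Settimana2013, FUN = mean)

# Divide each value by the mean of its week (Norm2013 is a hypothetical column name)
domanda$Norm2013 <- domanda$Destagionata2013 / media_settimana

# tapply alternative: one mean per week, then look up each row's week by name
medie <- tapply(domanda$Destagionata2013, domanda$Settimana2013, mean)
media_da_tapply <- as.vector(medie[as.character(domanda$Settimana2013)])
```

Note that a week whose values are all zero has a mean of zero, so the division gives NaN for those rows.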

Upvotes: 2
