Victor Shin

Reputation: 319

How can I repeat group means across rows?

I would like to get the mean value by group when there are two identifiers. Say I have the following dataset:

set.seed(123)
df <- data.frame(id = 1:2,
                 id2 = c("a","b", "c","c", "b","c", "a","b", "c","b"),
                 var1 = runif(10))
                 

I am trying to get the average value for groups defined by two identifiers with data.table. I would like to create another column (avg) holding the group averages, so the average repeats on every row that matches a given combination of id and id2. This is what I am trying to do:

setDT(df)[, avg := mean(var1), by=list(id,id2)]

To clarify: there are two values with id = 1 and id2 = a. Their average is (0.2875775 + 0.5281055)/2 = 0.4078415. I would like this value to appear in rows 1 and 7, which correspond to id = 1 and id2 = a, and likewise for all the other group averages. How can I do this?

Upvotes: 2

Views: 63

Answers (1)

Macosso

Reputation: 1439

library(tidyverse)

df %>% 
  group_by(id, id2) %>%
  mutate(avg = mean(var1))

      id id2     var1   avg
   <int> <chr>  <dbl> <dbl>
 1     1 a     0.288  0.408
 2     2 b     0.788  0.712
 3     1 c     0.409  0.480
 4     2 c     0.883  0.464
 5     1 b     0.940  0.940
 6     2 c     0.0456 0.464
 7     1 a     0.528  0.408
 8     2 b     0.892  0.712
 9     1 c     0.551  0.480
10     2 b     0.457  0.712

The code you presented does the same task using data.table. Note that setDT() converts df in place and := adds the column by reference, so if you print df you will see that the additional column was created.
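For reference, here is a minimal sketch of the data.table route, assuming the data.table package is installed. It reuses the dataset and the call from the question; the only changes are the `.(id, id2)` shorthand for `list(id, id2)` and the trailing `df[]`, which is there just to force printing.

```r
library(data.table)

# Same example data as in the question
set.seed(123)
df <- data.frame(id = 1:2,
                 id2 = c("a", "b", "c", "c", "b", "c", "a", "b", "c", "b"),
                 var1 = runif(10))

# setDT() converts df to a data.table in place (no copy is made);
# := then adds avg by reference, so the result does not need to be assigned.
setDT(df)[, avg := mean(var1), by = .(id, id2)]

# := returns its result invisibly, so print explicitly to see the new column.
df[]
```

Rows 1 and 7 (id = 1, id2 = a) should both show the same avg, matching the dplyr output above.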

Upvotes: 2
