Joan C

Reputation: 311

R: How to replace rows in a table by mean values

Sorry, I am probably using the wrong search terms but I couldn't find a solution.

Given an experiment with two participants (id), each performing a task 6 times under two varying parameters (par1,par2):

id <- c(rep(1,6),rep(2,6)) 
par1 <- c(rep("a",9),rep("b",3))
par2 <- c(rep("c",3),rep("d",9))
val <- rnorm(12)
data <- data.frame(id,par1,par2,val)

How can I replace all rows with identical values for "id","par1" and "par2" by a single row in which the value of "val" is the mean of the "val" values of the replaced rows?

The outcome is thus a table like this:

id par1 par2 val
1   a    c   (mean of rows 1-3)
1   a    d   (mean of rows 4-6)
2   a    d   (mean of rows 7-9)
2   b    d   (mean of rows 10-12)

Upvotes: 2

Views: 96

Answers (2)

akrun

Reputation: 886938

Here is an option with data.table: convert the data.frame to a data.table with setDT(data), group by 'id', 'par1' and 'par2', and take the mean of 'val' within each group.

library(data.table)
setDT(data)[, .(val = mean(val)), by = .(id, par1, par2)]
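
Note that setDT() converts `data` to a data.table in place. If the original data.frame should stay untouched, a minimal sketch using as.data.table() (which makes a copy) could look like the following. The result name `means` is only an illustrative choice, and `id` is built with rep(1:2, each = 6) here, so it is an integer rather than a double as in the question.

```r
library(data.table)

set.seed(123)  # for reproducibility
data <- data.frame(
  id   = rep(1:2, each = 6),
  par1 = c(rep("a", 9), rep("b", 3)),
  par2 = c(rep("c", 3), rep("d", 9)),
  val  = rnorm(12)
)

# as.data.table() copies, so `data` itself remains a plain data.frame
means <- as.data.table(data)[, .(val = mean(val)), by = .(id, par1, par2)]
means
```

The grouping step is the same as in the one-liner above; only the conversion differs.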

Upvotes: 1

Samuel

Reputation: 3053

For a dplyr approach:

library(dplyr)

set.seed(123)  # for reproducibility

id <- c(rep(1, 6), rep(2, 6))
par1 <- c(rep("a", 9), rep("b", 3))
par2 <- c(rep("c", 3), rep("d", 9))
val <- rnorm(12)
data <- data.frame(id, par1, par2, val)

# group by all variables except `val`
data %>% group_by_at(vars(-val)) %>% summarize(val = mean(val))

Which gives:

# A tibble: 4 x 4
# Groups:   id, par1 [?]
     id   par1   par2        val
  <dbl> <fctr> <fctr>      <dbl>
1     1      a      c  0.2560184
2     1      a      d  0.6382870
3     2      a      d -0.4969993
4     2      b      d  0.3794112
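
In dplyr 1.0.0 and later, group_by_at() is superseded (it still works). A sketch of the same idea with across(), assuming such a version is installed, might be the following. The name `result` is just for illustration, and `.groups = "drop"` returns an ungrouped tibble instead of one still grouped by id and par1.

```r
library(dplyr)

set.seed(123)  # for reproducibility
data <- data.frame(
  id   = c(rep(1, 6), rep(2, 6)),
  par1 = c(rep("a", 9), rep("b", 3)),
  par2 = c(rep("c", 3), rep("d", 9)),
  val  = rnorm(12)
)

# group by every column except `val`, then average `val` within each group
result <- data %>%
  group_by(across(-val)) %>%
  summarise(val = mean(val), .groups = "drop")
result
```

The means should match the four values shown above, since the seed and the data are the same.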

Upvotes: 2
