Jan
Jan

Reputation: 61

Calculate mean across several rows based on ID and condition and reshape into wide-format

I'm trying to figure out a way to calculate a mean across several row values (long format table) for participants and subsequently reshape the data in a wide format. Here is my use case example including missing data (just as my real df):

condition<-c("Participant 1 - Analysis 1","Participant 1 - Analysis 2","Participant 1 - Analysis 3","Participant 1 - Analysis 4","Participant 1 - Analysis 5","Participant 1 - Analysis 6",
             "Participant 2 - Analysis 1","Participant 2 - Analysis 2","Participant 2 - Analysis 3","Participant 2 - Analysis 4","Participant 2 - Analysis 5","Participant 2 - Analysis 6",
             "Participant 3 - Analysis 1","Participant 3 - Analysis 2","Participant 3 - Analysis 3","Participant 3 - Analysis 4","Participant 3 - Analysis 5","Participant 3 - Analysis 6")
value<-c(5,3,5,7,2,3,8,NA,8,2,4,3,NA,NA,NA,NA,NA,NA)
id<-c("d1","d1","d1","d1","d1","d1","d2","d2","d2","d2","d2","d2","d3","d3","d3","d3","d3","d3")
df<-as.data.frame(cbind(condition,value,id))

The solution I am looking for is something like this, including the mean for any cases per condition with at least 2 valid values but also missings (participant 2):

enter image description here

I really appreciate any hints on how to solve this.

Upvotes: 0

Views: 85

Answers (1)

akrun
akrun

Reputation: 887531

We can do

library(dplyr)
library(collapse)
library(tidyr)
 df %>%
   select(-condition) %>%
   group_by(id) %>% 
   mutate(grp2 =as.integer(gl(n(), 2, n()))) %>%
   group_by(grp2, .add = TRUE) %>%
   mutate(value = if(sum(!is.na(value)) >=2) value else NA) %>% 
  ungroup %>% 
  pivot_wider(names_from = grp2, values_from = value, values_fn = fmean)

-output

# A tibble: 3 × 4
  id      `1`   `2`   `3`
  <chr> <dbl> <dbl> <dbl>
1 d1        4     6   2.5
2 d2       NA     5   3.5
3 d3       NA    NA  NA  

Upvotes: 0

Related Questions