Reputation: 1644
I have a dataframe that looks as follows:
df <- read.table(header = TRUE, text =
"STUD_ID MEAS VALUE
1 LVEF(M-M) 69
1 LVEF(2D) 66
2 LVEF(2D) 36
2 LVEF(2D) 72
2 IVSD 63
3 LVEF(M-M) 50
4 LVEF(2D) 71
4 PASP 55", stringsAsFactors = FALSE)
I want to focus on MEAS
that are either LVEF(M-M)
or LVEF(2D)
.
STUD_ID
has VALUE
s that correspond to both LVEF(M-M)
and LVEF(2D)
, then we only filter for the latter i.e. LVEF(2D)
.STUD_ID
has 2 VALUE
s that correspond to LVEF(2D)
, then we take the mean.My desired output is:
# STUD_ID MEAS VALUE
# 1 LVEF(2D) 66
# 2 LVEF(2D) 54
# 2 IVSD 63
# 3 LVEF(M-M) 50
# 4 LVEF(2D) 71
# 4 PASP 55
I tried the following but it gave me an error:
df %>%
filter(MEAS == "LVEF(M-M)" | MEAS == "LVEF(2D)") %>%
arrange(STUD_ID, MEAS) %>%
group_by(STUD_ID, MEAS) %>%
mutate(n = n()) %>%
group_by(STUD_ID) %>%
mutate(nd = n_distinct(MEAS)) %>%
mutate(VALUE =
case_when(nd == 2 ~ VALUE[which(MEAS == "LVEF(2D)")],
nd == 1 & n > 1 ~ mean(VALUE),
TRUE ~ VALUE)) %>%
bind_rows(
df %>% filter(MEAS != "LVEF(M-M)" & MEAS != "LVEF(2D)")
)
My dataframe contains other variables apart from this 3 variables and I wish to retain them.
Upvotes: 0
Views: 80
Reputation: 18425
This does what you are looking for...
df2 <- df %>% group_by(STUD_ID,MEAS) %>%
summarise(VALUE=mean(VALUE)) %>%
group_by(STUD_ID) %>% mutate(TEMP2D=("LVEF(2D)" %in% MEAS)) %>%
filter(!(MEAS=="LVEF(M-M)" & TEMP2D)) %>%
select(-TEMP2D)
df2
# A tibble: 6 x 3
# Groups: STUD_ID [4]
STUD_ID MEAS VALUE
<int> <chr> <dbl>
1 1 LVEF(2D) 66
2 2 IVSD 63
3 2 LVEF(2D) 54
4 3 LVEF(M-M) 50
5 4 LVEF(2D) 71
6 4 PASP 55
Upvotes: 1