DIFC
DIFC

Reputation: 7

dplyr:case_when/mutate/filter when multiple rows have a common value

A beginner here. I have data on students and courses in which they are either enrolled in, have dropped/withdrew, or never enrolled. Hence I have multiple rows of the same student as they are associated with multiple courses in a given term. My desired output is to define each student (and not each course) as enrolled, dropped/withdrew, or never enrolled to a single row in order to count the number of unique students under each category.

If a student is enrolled in at least one course then they would be defined as an enrolled ("E") student regardless the status of the student's other courses. If a student has dropped/withdrew all their courses then they would be defined as a dropped/withdrawn ("DW") student. If the enrollment status of the course for the student is "NA" this means the student applied but never enrolled in any courses and would be defined as a never enrolled ("NE") student.

Would appreciate the appropriate code to do this.

Current output:

STUDENT_ID  COURSE   COURSE_ENROLLMENT_STATUS
12345       MATH123  ENROLLED
12345       MATH345  DROPPED
45332       NA       NA
53930       ENGL302  DROPPED
53930       MATH312  WITHDREW
53930       ENGN102  DROPPED

Desired output:

STUDENT_ID  STUDENT_TYPE
12345       E
45332       NE
53930       DW

Upvotes: 0

Views: 719

Answers (1)

Pal R.K.
Pal R.K.

Reputation: 118

library(dplyr)

df %>%
  group_by(STUDENT_ID) %>%
  mutate(STUDENT_TYPE = case_when(
    any(COURSE_ENROLLMENT_STATUS == "ENROLLED") ~ "E",
    all(COURSE_ENROLLMENT_STATUS == "DROPPED" | COURSE_ENROLLMENT_STATUS == "WITHDREW") ~ "DW",
    TRUE ~ "NE")) %>%
  select(STUDENT_ID, STUDENT_TYPE) %>% distinct()

Upvotes: 1

Related Questions