benson23

Reputation: 19142

R: return multiple row values belonging to the group minimum and compare them in case_when

Dataset

Let's say I have the following dataframe:

library(dplyr)  # also provides tibble() and %>%

df <- tibble(ID = c("A", "A", "A", "B", "C", "C", "D", "D", "D", "D", "E", "E", "E"), 
             Encounter = c(10, 11, 12, 3, 5, 50, 8, 8, 15, 20, 2, 8, 10), 
             Item = c("apple", "toy", "bowl", "apple", "mango", "mango", "toy", "brush", "toy", "brush", "brush", "key", "key"))

# A tibble: 13 x 3
   ID    Encounter Item 
   <chr>     <dbl> <chr>
 1 A            10 apple
 2 A            11 toy  
 3 A            12 bowl 
 4 B             3 apple
 5 C             5 mango
 6 C            50 mango
 7 D             8 toy  
 8 D             8 brush
 9 D            15 toy  
10 D            20 brush
11 E             2 brush
12 E             8 key  
13 E            10 key   

Criteria

Within each ID, I want to flag whether an Item from the first Encounter appears again in a subsequent Encounter.

  1. In A, the Item in the first Encounter is apple, which does not appear in any subsequent Encounter, so the output should be FALSE.

  2. In C, mango does appear in a subsequent Encounter, so the output for that later row should be TRUE.

  3. In D, both toy and brush are in the first Encounter, and both appear in subsequent Encounters, so the output for those later rows should be TRUE.

  4. Rows belonging to the first Encounter should always be FALSE.

Desired output

Here is my desired output for your better understanding:

# A tibble: 13 x 4
   ID    Encounter Item  Output
   <chr>     <dbl> <chr> <lgl> 
 1 A            10 apple FALSE 
 2 A            11 toy   FALSE 
 3 A            12 bowl  FALSE 
 4 B             3 apple FALSE 
 5 C             5 mango FALSE 
 6 C            50 mango TRUE  
 7 D             8 toy   FALSE 
 8 D             8 brush FALSE 
 9 D            15 toy   TRUE  
10 D            20 brush TRUE  
11 E             2 brush FALSE 
12 E             8 key   FALSE 
13 E            10 key   FALSE 

My attempt

I used dplyr::case_when() to:

  1. set the rows of the minimum Encounter to FALSE (successful)

  2. set Items that are NOT in the first Encounter to FALSE (successful)

  3. set Items that ARE in the first Encounter to TRUE (FAILED when the first Encounter contains multiple Items)

df %>% 
  group_by(ID) %>% 
  arrange(ID, Encounter) %>% 
  mutate(Output = case_when(
    Encounter == min(Encounter) ~ FALSE,  # rows of the first Encounter
    Item %in% first(Item) ~ TRUE,         # first() returns only one Item per group
    !(Item %in% first(Item)) ~ FALSE
  ))

# A tibble: 13 x 4
# Groups:   ID [5]
   ID    Encounter Item  Output
   <chr>     <dbl> <chr> <lgl> 
 1 A            10 apple FALSE 
 2 A            11 toy   FALSE 
 3 A            12 bowl  FALSE 
 4 B             3 apple FALSE 
 5 C             5 mango FALSE 
 6 C            50 mango TRUE  
 7 D             8 toy   FALSE 
 8 D             8 brush FALSE 
 9 D            15 toy   TRUE  
10 D            20 brush FALSE 
11 E             2 brush FALSE 
12 E             8 key   FALSE 
13 E            10 key   FALSE 

Ultimate question

Is there a function that acts like dplyr::first() but can return multiple values, so that the result can be used inside case_when() or ifelse()?

For example, in D, I don't know how to return both toy and brush so that they can be compared using %in%.

Sorry for the long question, I hope someone can help!

Also, it feels like my case_when() expression is not written in a smart way, so please feel free to leave a comment if you have suggestions. Thanks in advance!

Upvotes: 1

Views: 92

Answers (1)

akrun

Reputation: 887901

We can use duplicated(). This assumes the rows are already sorted by 'Encounter' within each 'ID'; if not, call arrange(ID, Encounter) before the group_by().

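library(dplyr)
df %>% 
  group_by(ID) %>% 
  # first(Item) %in% Item[-1]: does the group's first Item recur later?
  # duplicated(Item): TRUE for every repeat occurrence of an Item
  mutate(Output = first(Item) %in% Item[-1] & duplicated(Item)) %>%
  ungroup()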

Output:

# A tibble: 13 × 4
   ID    Encounter Item  Output
   <chr>     <dbl> <chr> <lgl> 
 1 A            10 apple FALSE 
 2 A            11 toy   FALSE 
 3 A            12 bowl  FALSE 
 4 B             3 apple FALSE 
 5 C             5 mango FALSE 
 6 C            50 mango TRUE  
 7 D             8 toy   FALSE 
 8 D             8 brush FALSE 
 9 D            15 toy   TRUE  
10 D            20 brush TRUE  
11 E             2 brush FALSE 
12 E             8 key   FALSE 
13 E            10 key   FALSE 
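
As a possible alternative that addresses the "multiple values" part of the question more directly, the Items of the first Encounter can be obtained by subsetting Item with a logical condition, i.e. Item[Encounter == min(Encounter)], instead of first(Item). The sketch below is only one way to write it; the helper column is_first is a name introduced here for illustration. As far as I can tell, the duplicated() version above looks only at first(Item), so it could miss a case where just the second Item of a tied first Encounter recurs, whereas this version compares against all of them and does not depend on the row order.

```r
library(dplyr)

df <- tibble(
  ID = c("A", "A", "A", "B", "C", "C", "D", "D", "D", "D", "E", "E", "E"),
  Encounter = c(10, 11, 12, 3, 5, 50, 8, 8, 15, 20, 2, 8, 10),
  Item = c("apple", "toy", "bowl", "apple", "mango", "mango", "toy",
           "brush", "toy", "brush", "brush", "key", "key")
)

result <- df %>%
  group_by(ID) %>%
  mutate(
    # is_first (hypothetical helper column): rows of the group's earliest Encounter
    is_first = Encounter == min(Encounter),
    # Item[is_first] holds every Item of the earliest Encounter, not just one;
    # rows of the earliest Encounter itself are forced to FALSE
    Output = !is_first & Item %in% Item[is_first]
  ) %>%
  ungroup() %>%
  select(-is_first)

result
```

Because the result is a single logical expression, case_when() is not needed here; the three branches in the question collapse into one condition combined with &.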

Upvotes: 1
