Reputation: 19142
Let's say I have the following dataframe:
library(dplyr)  # also re-exports tibble()
df <- tibble(ID = c("A", "A", "A", "B", "C", "C", "D", "D", "D", "D", "E", "E", "E"),
             Encounter = c(10, 11, 12, 3, 5, 50, 8, 8, 15, 20, 2, 8, 10),
             Item = c("apple", "toy", "bowl", "apple", "mango", "mango", "toy", "brush", "toy", "brush", "brush", "key", "key"))
# A tibble: 13 x 3
   ID    Encounter Item
   <chr>     <dbl> <chr>
 1 A            10 apple
 2 A            11 toy
 3 A            12 bowl
 4 B             3 apple
 5 C             5 mango
 6 C            50 mango
 7 D             8 toy
 8 D             8 brush
 9 D            15 toy
10 D            20 brush
11 E             2 brush
12 E             8 key
13 E            10 key
I wish to find out whether the Item in the first Encounter appears in any subsequent Encounter.
For example, in A, the Item in the first Encounter is apple, which does not appear in any subsequent Encounter, so the output should be FALSE.
In C, mango does appear in a subsequent Encounter, so the output should be TRUE.
In D, both toy and brush are in the first Encounter, and both appear in subsequent Encounters, so the output should be TRUE for those later rows.
The output for any Item in the first Encounter itself should always be FALSE.
Here is my desired output for your better understanding:
# A tibble: 13 x 4
   ID    Encounter Item  Output
   <chr>     <dbl> <chr> <lgl>
 1 A            10 apple FALSE
 2 A            11 toy   FALSE
 3 A            12 bowl  FALSE
 4 B             3 apple FALSE
 5 C             5 mango FALSE
 6 C            50 mango TRUE
 7 D             8 toy   FALSE
 8 D             8 brush FALSE
 9 D            15 toy   TRUE
10 D            20 brush TRUE
11 E             2 brush FALSE
12 E             8 key   FALSE
13 E            10 key   FALSE
I have used dplyr::case_when() to do three things:
- set the rows with the minimum Encounter to FALSE (successful)
- set any Item that is NOT in the first Encounter to FALSE (successful)
- set any Item that IS in the first Encounter to TRUE (FAILED if there are multiple Items in the first Encounter)
df %>%
  group_by(ID) %>%
  arrange(ID, Encounter) %>%
  mutate(Output = case_when(
    Encounter == min(Encounter) ~ FALSE,
    Item %in% first(Item)       ~ TRUE,   # first() returns only one value per group
    !(Item %in% first(Item))    ~ FALSE
  ))
# A tibble: 13 x 4
# Groups:   ID [5]
   ID    Encounter Item  Output
   <chr>     <dbl> <chr> <lgl>
 1 A            10 apple FALSE
 2 A            11 toy   FALSE
 3 A            12 bowl  FALSE
 4 B             3 apple FALSE
 5 C             5 mango FALSE
 6 C            50 mango TRUE
 7 D             8 toy   FALSE
 8 D             8 brush FALSE
 9 D            15 toy   TRUE
10 D            20 brush FALSE
11 E             2 brush FALSE
12 E             8 key   FALSE
13 E            10 key   FALSE
Is there any function that acts like dplyr::first() but can return multiple values, and that can be used inside case_when() or ifelse()?
For example, in D, I don't know how to return both toy and brush so that they can be compared using %in%.
Sorry for the long question, hope someone can help!
Also, I feel that my case_when() expression is not written in a smart way, so please feel free to leave a comment if you have suggestions. Thanks in advance!
Upvotes: 1
Views: 92
Reputation: 887901
We may use duplicated(). The values in 'Encounter' are already sorted here; if they are not, do an arrange(ID, Encounter) before the group_by().
library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(Output = first(Item) %in% Item[-1] & duplicated(Item)) %>%
  ungroup()
-output
# A tibble: 13 × 4
   ID    Encounter Item  Output
   <chr>     <dbl> <chr> <lgl>
 1 A            10 apple FALSE
 2 A            11 toy   FALSE
 3 A            12 bowl  FALSE
 4 B             3 apple FALSE
 5 C             5 mango FALSE
 6 C            50 mango TRUE
 7 D             8 toy   FALSE
 8 D             8 brush FALSE
 9 D            15 toy   TRUE
10 D            20 brush TRUE
11 E             2 brush FALSE
12 E             8 key   FALSE
13 E            10 key   FALSE
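On the original question of a first()-like helper that returns several values: one possible sketch is to subset Item by the rows whose Encounter equals the group minimum, which yields every Item of the first Encounter rather than only the first row's. This is an alternative to the duplicated() approach, not a restatement of it; the helper column is_first and the name result are made up for illustration. Since first(Item) looks only at the first row of each group, this variant may be the safer choice when the first Encounter holds several Items and only a later one of them reappears.

```r
library(dplyr)

df <- tibble(
  ID        = c("A", "A", "A", "B", "C", "C", "D", "D", "D", "D", "E", "E", "E"),
  Encounter = c(10, 11, 12, 3, 5, 50, 8, 8, 15, 20, 2, 8, 10),
  Item      = c("apple", "toy", "bowl", "apple", "mango", "mango", "toy",
                "brush", "toy", "brush", "brush", "key", "key")
)

result <- df %>%
  group_by(ID) %>%
  mutate(
    # TRUE for every row that belongs to the group's first (minimum) Encounter
    is_first = Encounter == min(Encounter),
    # Item[is_first] holds ALL Items of the first Encounter (e.g. toy and brush in D);
    # rows of the first Encounter itself are forced to FALSE
    Output   = !is_first & Item %in% Item[is_first]
  ) %>%
  ungroup() %>%
  select(-is_first)  # drop the illustrative helper column

result$Output
# FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
```

Because it compares against min(Encounter) instead of row position, this version does not depend on the rows being sorted within each ID.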
Upvotes: 1