Reputation: 237
I have the dataframe:
df <- data.frame(subject=c('x','x','x','y','y','y','z','z','z'),
trial=c(1,2,3,1,2,3,1,2,3),
condition=c('A','A','B','B','B','B','A','A','A'))
I would like to create a list of subjects for which the condition in trial number 1 is A and the condition in trial 3 is B. In the example above, this would be subject x only.
Ideally I would like to do this by grouping by subject, summarizing first_condition and third_condition for each participant, and then filtering according to the statement first_condition=='A' & third_condition=='B'. But I don't know how to extract the condition for a specific trial number when summarizing.
Any ideas? Thanks!
Upvotes: 1
Views: 43
Reputation: 160782
Since the logic concerns specific pairs of trial and condition, it might be useful to inner-join your data with a table of permitted pairs and then find the subjects for which all of the trial/condition pairs are present.
library(dplyr)
fltr <- tibble(trial = c(1, 3), condition = c("A", "B"))
fltr
# # A tibble: 2 x 2
# trial condition
# <dbl> <chr>
# 1 1 A
# 2 3 B
We'll do an "inner join", which means that we only retain rows that are present in both sides.
df %>%
  inner_join(fltr, by = c("trial", "condition"))
# subject trial condition
# 1 x 1 A
# 2 x 3 B
# 3 y 3 B
# 4 z 1 A
From here, we need to filter those where a subject has both trials:
df %>%
  inner_join(fltr, by = c("trial", "condition")) %>%
  group_by(subject) %>%
  filter(all(c(1, 3) %in% trial)) %>%
  ungroup()
# # A tibble: 2 x 3
# subject trial condition
# <chr> <dbl> <chr>
# 1 x 1 A
# 2 x 3 B
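If only the subject ids are needed, one possible variation is to count the matched rows per subject and compare that count with the number of required pairs. This is a minimal sketch, assuming the same df and fltr as above and at most one row per subject and trial; the name matched_subjects is made up for illustration.

```r
library(dplyr)

# Example data from the question
df <- data.frame(subject = c('x','x','x','y','y','y','z','z','z'),
                 trial = c(1,2,3,1,2,3,1,2,3),
                 condition = c('A','A','B','B','B','B','A','A','A'))

# Required trial/condition pairs
fltr <- tibble(trial = c(1, 3), condition = c("A", "B"))

# Keep subjects that matched every required pair
# (assumes no duplicated subject/trial rows)
matched_subjects <- df %>%
  inner_join(fltr, by = c("trial", "condition")) %>%
  count(subject) %>%
  filter(n == nrow(fltr)) %>%
  pull(subject)
```

Comparing against nrow(fltr) rather than a hard-coded 2 means the table of pairs can grow without touching the pipeline.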
Another method is to pivot wider, filter on specific trials, and optionally pivot back to long format (using tidyr).
The initial pivot-wider:
df %>%
  tidyr::pivot_wider(id_cols = subject, names_from = "trial", names_prefix = "trial_", values_from = "condition")
# # A tibble: 3 x 4
# subject trial_1 trial_2 trial_3
# <chr> <chr> <chr> <chr>
# 1 x A A B
# 2 y B B B
# 3 z A A A
And then a very readable filter:
df %>%
  tidyr::pivot_wider(id_cols = subject, names_from = "trial", names_prefix = "trial_", values_from = "condition") %>%
  filter(trial_1 == "A" & trial_3 == "B")
# # A tibble: 1 x 4
# subject trial_1 trial_2 trial_3
# <chr> <chr> <chr> <chr>
# 1 x A A B
You can convert it back again with:
df %>%
  tidyr::pivot_wider(id_cols = subject, names_from = "trial", names_prefix = "trial_", values_from = "condition") %>%
  filter(trial_1 == "A" & trial_3 == "B") %>%
  tidyr::pivot_longer(-subject, names_to = "trial", values_to = "condition")
# # A tibble: 3 x 3
# subject trial condition
# <chr> <chr> <chr>
# 1 x trial_1 A
# 2 x trial_2 A
# 3 x trial_3 B
This has the advantage of keeping all trials for that subject, regardless of whether each one was among the 1/A and 3/B pairs we initially filtered on.
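Pivoting back turns trial into a character column (trial_1, trial_2, ...). If the original long layout with numeric trial values is preferred, one option is to use the wide filter only to pick the subjects and then semi-join the original data against them. This is a sketch under the same assumptions as above; the names keep and result are illustrative.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(subject = c('x','x','x','y','y','y','z','z','z'),
                 trial = c(1,2,3,1,2,3,1,2,3),
                 condition = c('A','A','B','B','B','B','A','A','A'))

# Subjects whose trial 1 is "A" and whose trial 3 is "B"
keep <- df %>%
  pivot_wider(id_cols = subject, names_from = trial,
              names_prefix = "trial_", values_from = condition) %>%
  filter(trial_1 == "A", trial_3 == "B") %>%
  select(subject)

# All original rows for those subjects, columns unchanged
result <- df %>%
  semi_join(keep, by = "subject")
```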
Upvotes: 0
Reputation: 18435
I think this is what you are describing...
df %>% group_by(subject) %>%
summarise(first_cond = condition[trial==1],
third_cond = condition[trial==3]) %>%
filter(first_cond == "A",
third_cond == "B")
# A tibble: 1 x 3
subject first_cond third_cond
<chr> <chr> <chr>
1 x A B
This will work provided there is only one condition for each value of trial for each subject.
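That assumption can be checked before summarising. A minimal sketch, assuming the df from the question (the names duplicates and n_rows are made up here), lists every subject/trial combination that appears more than once:

```r
library(dplyr)

# Example data from the question
df <- data.frame(subject = c('x','x','x','y','y','y','z','z','z'),
                 trial = c(1,2,3,1,2,3,1,2,3),
                 condition = c('A','A','B','B','B','B','A','A','A'))

# Subject/trial combinations that occur more than once
duplicates <- df %>%
  count(subject, trial, name = "n_rows") %>%
  filter(n_rows > 1)

# Zero rows here means each subject has exactly one condition per trial
nrow(duplicates)
```

If duplicates is not empty, condition[trial == 1] yields more than one value for some subject, and summarise() no longer produces exactly one row per subject.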
Upvotes: 1
Reputation: 11596
See if this answers:
df %>%
  filter(trial %in% c(1, 3)) %>%
  group_by(subject) %>%
  # keep subjects whose trial 1 is "A" and whose trial 3 is "B"
  filter(any(trial == 1 & condition == "A"), any(trial == 3 & condition == "B"))
# A tibble: 2 x 3
# Groups: subject [1]
subject trial condition
<chr> <dbl> <chr>
1 x 1 A
2 x 3 B
Upvotes: 0