flâneur
flâneur

Reputation: 627

R Create multiple rows from 1 row based on presence of values in certain columns

I have a data frame that looks like the following:

ID Date Participant_1 Participant_2 Participant_3 Covariate 1 Covariate 2 Covariate 3

1 9/1      A             B                            16           2           1
2 5/4      B                                          4            2           2
3 6/3      C             A              B             8            3           6
4 2/8      A                                          7            8           4
5 9/3      C             A                            7            1           3

I need to expand this data frame so that a row is present for all of the participants present at each event "ID", with the date and all other variables in all the created rows. The multiple participant columns would now only be one column for participant. The output would therefore be:

ID Date Participant  Covariate 1 Covariate 2 Covariate 3

1 9/1      A               16           2           1
1 9/1      B               16           2           1
2 5/4      B               4            2           2
3 6/3      C               8            3           6
3 6/3      A               8            3           6
3 6/3      B               8            3           6
4 2/8      A               7            8           4
5 9/3      C               7            1           3
5 9/3      A               7            1           3

Is there a way to do this efficiently? Perhaps with a pivot function?

Upvotes: 0

Views: 96

Answers (1)

Jilber Urbina
Jilber Urbina

Reputation: 61214

We can use pivot_longer and then some formatting

library(tidyr)
  df %>% 
    pivot_longer(starts_with("Participant"), values_to = "Participant") %>% 
    select(-name) %>% 
    relocate(Participant, .before = Covariate_1) %>% 
    drop_na()
# A tibble: 9 × 6
     ID Date  Participant Covariate_1 Covariate_2 Covariate_3
  <int> <chr> <chr>             <int>       <int>       <int>
1     1 9/1   A                    16           2           1
2     1 9/1   B                    16           2           1
3     2 5/4   B                     4           2           2
4     3 6/3   C                     8           3           6
5     3 6/3   A                     8           3           6
6     3 6/3   B                     8           3           6
7     4 2/8   A                     7           8           4
8     5 9/3   C                     7           1           3
9     5 9/3   A                     7           1           3

Here's the example data used:

df <- structure(list(ID = 1:5, Date = c("9/1", "5/4", "6/3", "2/8", 
"9/3"), Participant_1 = c("A", "B", "C", "A", "C"), Participant_2 = c("B", 
NA, "A", NA, "A"), Participant_3 = c(NA, NA, "B", NA, NA), Covariate_1 = c(16L, 
4L, 8L, 7L, 7L), Covariate_2 = c(2L, 2L, 3L, 8L, 1L), Covariate_3 = c(1L, 
2L, 6L, 4L, 3L)), class = "data.frame", row.names = c(NA, -5L
))

Upvotes: 1

Related Questions