Wide to long format with several variables

Question

This is a follow-up to a previous question of mine on converting from wide to long format in R, this time with an additional complication.

The previous question is here: Wide to long data conversion

The wide data I start with looks like the following:

d2 <- data.frame('id' = c(1,2),
             'Q1' = c(2,3),
             'Q2' = c(1,3),
             'Q3' = c(3,1),
             'Q1_X_Opt_1' = c(0,0),
             'Q1_X_Opt_2' = c(75,200),
             'Q1_X_Opt_3' = c(150,300),
             'Q2_X_Opt_1' = c(0,0),
             'Q2_X_Opt_2' = c(150,200),
             'Q2_X_Opt_3' = c(75,300),
             'Q3_X_Opt_1' = c(0,0),
             'Q3_X_Opt_2' = c(100,500),
             'Q3_X_Opt_3' = c(150,300))

In this example, two individuals have each answered three questions. The answer to each question takes a value in {1,2,3} and is stored in Q1, Q2, and Q3. So, in this example, individual 1 chose option 2 in Q1, option 1 in Q2, and option 3 in Q3.

Each option also has an associated variable X (for example, Q1_X_Opt_2 holds X for option 2 of question 1) that I need converted to long format as well. In the output I am seeking, choice is 1 for the option the individual chose and 0 otherwise, and cost holds the value of X:

    id question option choice cost
1   1        1      1      0    0
2   1        1      2      1   75
3   1        1      3      0  150
4   1        2      1      1    0
5   1        2      2      0  150
6   1        2      3      0   75
7   1        3      1      0    0
8   1        3      2      0  100
9   1        3      3      1  150
10  2        1      1      0    0
11  2        1      2      0  200
12  2        1      3      1  300
13  2        2      1      0    0
14  2        2      2      0  200
15  2        2      3      1  300
16  2        3      1      1    0
17  2        3      2      0  500
18  2        3      3      0  300

I have tried adapting the code from the answer to the prior question, but with no success so far. Thanks for any suggestions or comments.

alistaire · Accepted Answer

It's not exactly elegant, but here's a tidyverse version:

library(tidyverse)

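d3 <- d2 %>% 
    gather(option, cost, -id:-Q3) %>%    # stack the nine Q*_X_Opt_* columns into option/cost
    gather(question, choice, Q1:Q3) %>%    # stack the three answer columns into question/choice
    separate(option, c('question2', 'option'), extra = 'merge') %>%    # 'Q1_X_Opt_1' -> 'Q1', 'X_Opt_1'
    filter(question == question2) %>%    # keep rows where cost and answer refer to the same question
    mutate_at(vars(question, option), parse_number) %>%    # 'Q1' -> 1, 'X_Opt_1' -> 1
    mutate(choice = as.integer(option == choice)) %>%    # 1 if this option is the one chosen, else 0
    select(1, 5, 3, 6, 4) %>%    # reorder to id, question, option, choice, cost
    arrange(id)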

d3
#>    id question option choice cost
#> 1   1        1      1      0    0
#> 2   1        1      2      1   75
#> 3   1        1      3      0  150
#> 4   1        2      1      1    0
#> 5   1        2      2      0  150
#> 6   1        2      3      0   75
#> 7   1        3      1      0    0
#> 8   1        3      2      0  100
#> 9   1        3      3      1  150
#> 10  2        1      1      0    0
#> 11  2        1      2      0  200
#> 12  2        1      3      1  300
#> 13  2        2      1      0    0
#> 14  2        2      2      0  200
#> 15  2        2      3      1  300
#> 16  2        3      1      1    0
#> 17  2        3      2      0  500
#> 18  2        3      3      0  300
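
For readers on a newer tidyr (1.0.0 or later, where pivot_longer() is available), the same reshaping can also be sketched without the gather-then-filter step. This is only an alternative sketch, not the approach from the answer above: it reshapes the answers and the costs separately and joins them on id and question. The names answers, costs, chosen, and d4 are illustrative and not from the original post.

```r
library(dplyr)
library(tidyr)

# Same wide data as in the question
d2 <- data.frame('id' = c(1,2),
                 'Q1' = c(2,3),
                 'Q2' = c(1,3),
                 'Q3' = c(3,1),
                 'Q1_X_Opt_1' = c(0,0),
                 'Q1_X_Opt_2' = c(75,200),
                 'Q1_X_Opt_3' = c(150,300),
                 'Q2_X_Opt_1' = c(0,0),
                 'Q2_X_Opt_2' = c(150,200),
                 'Q2_X_Opt_3' = c(75,300),
                 'Q3_X_Opt_1' = c(0,0),
                 'Q3_X_Opt_2' = c(100,500),
                 'Q3_X_Opt_3' = c(150,300))

# One row per id/question, holding the number of the chosen option
answers <- d2 %>%
    select(id, Q1:Q3) %>%
    pivot_longer(-id, names_to = 'question', names_prefix = 'Q',
                 values_to = 'chosen')

# One row per id/question/option, holding X (the cost)
costs <- d2 %>%
    select(id, matches('_X_Opt_')) %>%
    pivot_longer(-id, names_to = c('question', 'option'),
                 names_pattern = 'Q(\\d+)_X_Opt_(\\d+)',
                 values_to = 'cost')

# Attach the chosen option to every cost row, then flag the match
d4 <- costs %>%
    inner_join(answers, by = c('id', 'question')) %>%
    mutate(question = as.integer(question),
           option = as.integer(option),
           choice = as.integer(option == chosen)) %>%
    select(id, question, option, choice, cost) %>%
    arrange(id, question, option)
```

Printing d4 should give the same 18 rows as the output requested in the question. Joining on id and question avoids building every answer/cost combination and then filtering most of them away, which may matter with many questions.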
