Multiple-choice in R: how to tidy survey data using dplyr/tidyr?

Question

I have been using gather() from the tidyr R package to tidy my survey data.

Is there a good way to deal with multiple-choice questions when tidying data?

This question is not about a specific error, but about which strategy fits best.

Imagine the following tibble:

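library(dplyr)
library(tidyr)

# x1:x3 and y1:y2 are binary multiple-choice columns (1 = selected, NA = not selected)
tb1 <- tribble(~id, ~x1, ~x2, ~x3, ~y1, ~y2, ~z,
               "Harry",   1,  1, NA, NA, 1, "No",
               "Jess",   NA,  1,  1,  1, 1, "Yes",
               "George", NA, NA,  1, NA, 1, "No")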

When gathering these multiple-choice columns, I (logically) get multiple rows for every respondent who selected more than one option, here 'Harry' and 'Jess':

tb1 %>%
  gather(X, val, x1:x3) %>%
  filter(!is.na(val)) %>%
  select(-val) %>%
  gather(Y, val, y1:y2) %>%
  filter(!is.na(val)) %>%
  select(-val)

# A tibble: 7 x 4
  id     z     X     Y    
  <chr>  <chr> <chr> <chr>
1 Jess   Yes   x2    y1   
2 Jess   Yes   x3    y1   
3 Harry  No    x1    y2   
4 Harry  No    x2    y2   
5 Jess   Yes   x2    y2   
6 Jess   Yes   x3    y2   
7 George No    x3    y2

I'm a bit worried about these repeated entries, and I wonder whether there is a good strategy for multiple-choice survey questions that are stored as binary columns and need to be gathered.

In the end, I'd like to be able to plot and analyse the values of various variables, e.g. the number of times that people selected y2.

This long format does not seem practical for that, because count() is inflated: Harry selected y2 only once, but it appears in two rows, one for each x option he chose.
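To make the inflation concrete, here is a minimal sketch that rebuilds the crossed long table and compares a plain count() with a count of distinct ids. The names `long`, `naive` and `per_person` are made up for illustration, and the sketch assumes dplyr and tidyr are installed.

```r
library(dplyr)
library(tidyr)

tb1 <- tribble(~id, ~x1, ~x2, ~x3, ~y1, ~y2, ~z,
               "Harry",   1,  1, NA, NA, 1, "No",
               "Jess",   NA,  1,  1,  1, 1, "Yes",
               "George", NA, NA,  1, NA, 1, "No")

# `long` (hypothetical name) is the crossed result: one row per
# (respondent, selected x option, selected y option)
long <- tb1 %>%
  gather(X, val, x1:x3) %>%
  filter(!is.na(val)) %>%
  select(-val) %>%
  gather(Y, val, y1:y2) %>%
  filter(!is.na(val)) %>%
  select(-val)

# Counts rows, so a respondent is counted once per x option they selected
naive <- count(long, Y)

# Counts respondents, so each person contributes at most 1 per y option
per_person <- long %>%
  group_by(Y) %>%
  summarise(n = n_distinct(id))
```

With this data, `naive` reports 5 for y2 while `per_person` reports 3, which matches the three respondents. One caveat with this workaround: a respondent who selected no x option at all would drop out of `long` entirely, so even the distinct count could undercount.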

My questions on this topic are as follows:

Would it be better/easier for analysis to gather multiple responses into a single column?
If yes, how do you do this efficiently?
If no, what implications do I have to watch out for in further analysis when I keep the multi-responses in long format,
and how do you incorporate those implications into your code? (Maybe a specific "group" argument for id? Could you show me an example?)
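As a rough sketch of what the "keep it long" route could look like (one possibility under stated assumptions, not a settled answer): the x and y columns can be stacked in a single gather() instead of being crossed in two passes, so that each respondent contributes at most one row per option. The names `long2`, `option`, `question` and `option_counts` are hypothetical, and deriving the question from the first letter of the column name only works because of this example's naming scheme.

```r
library(dplyr)
library(tidyr)

tb1 <- tribble(~id, ~x1, ~x2, ~x3, ~y1, ~y2, ~z,
               "Harry",   1,  1, NA, NA, 1, "No",
               "Jess",   NA,  1,  1,  1, 1, "Yes",
               "George", NA, NA,  1, NA, 1, "No")

# One row per (respondent, selected option); x and y are stacked, not crossed
long2 <- tb1 %>%
  gather(option, val, x1:x3, y1:y2) %>%
  filter(!is.na(val)) %>%
  select(-val) %>%
  # Assumption: the first letter of the column name identifies the question
  mutate(question = substr(option, 1, 1))

# Each respondent appears at most once per option, so a plain count() is safe
option_counts <- count(long2, question, option)
```

In this layout y2 is counted 3 times, once per respondent. Cross-tabulating x against y would then need a join of `long2` with itself on id, which brings the repeated rows back, so the crossing is probably best done only at the point where it is actually needed.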
