Schwalmski
Schwalmski

Reputation: 1

How do I separate nominal Mulitple Choice answers from one variable into single variables?

My problem is similar to this question. But the solution didn't work for me.

I asked tourists what kind of traveler types they are. It was a multiple-choice question with 6 different categories (Nature, Culture, Health, Active, Family, Other) and they were able to choose more than one type. In Google Sheets and Excel, the answers within one column are separated by commas. So I ended up with a variable that looks kind of like this:

Traveller Type
Family, Nature
Nature
Culture, Nature
Nature, Other
Culture

How do I separate the traveller types into new variables of my big table in RStudio? It should look like this:

Traveller Type Family Nature Culture Health Active Other
Family, Nature 1 1 0 0 0 0
Nature 0 1 0 0 0 0
Culture, Nature 0 1 1 0 0 0
Nature, Other 0 1 0 0 0 1
Culture, Active 0 0 1 0 1 0

I've tried this, but the function "mutate_at" didn't work, and also I didn't really understand how to insert my data.

Upvotes: 0

Views: 218

Answers (2)

Mike Steelson
Mike Steelson

Reputation: 15318

In google sheets you can try this

=sort(query(
  arrayformula(split(flatten( 
    arrayformula(trim(iferror(split(A2:A,",")))) &","& 
    transpose(arrayformula(substitute(query(transpose(B2:G),,9^9)," ",","))) ),",",true,false)),
  "select Col1,sum(Col"&textjoin("),sum(Col",,sequence(1,columns(B1:G1),2,1))&") 
    where Col1 is not null group by Col1 
    label "&textjoin("', ",,arrayformula("sum(Col"&Column(B1:G1)&") '"&B1:G1))&"' "))

enter image description here

Upvotes: 0

Phil
Phil

Reputation: 8107

df <- data.frame(Traveller_Type =
                   c("Family, Nature",
                     "Nature",
                     "Culture, Nature",
                     "Nature, Other",
                     "Culture")) 

library(dplyr)
library(tidyr)

df |> 
  mutate(id = row_number(), value = 1) |> 
  separate_rows(Traveller_Type) |> 
  pivot_wider(names_from = Traveller_Type, values_fill = 0)

# A tibble: 5 x 5
     id Family Nature Culture Other
  <int>  <dbl>  <dbl>   <dbl> <dbl>
1     1      1      1       0     0
2     2      0      1       0     0
3     3      0      1       1     0
4     4      0      1       0     1
5     5      0      0       1     0

Upvotes: 1

Related Questions