Reputation: 1
My problem is similar to this question. But the solution didn't work for me.
I asked tourists what kind of traveler types they are. It was a multiple-choice question with 6 different categories (Nature, Culture, Health, Active, Family, Other) and they were able to choose more than one type. In Google Sheets and Excel, the answers within one column are separated by commas. So I ended up with a variable that looks kind of like this:
Traveller Type |
---|
Family, Nature |
Nature |
Culture, Nature |
Nature, Other |
Culture |
How do I separate the traveller types into new variables of my big table in RStudio? It should look like this:
Traveller Type | Family | Nature | Culture | Health | Active | Other |
---|---|---|---|---|---|---|
Family, Nature | 1 | 1 | 0 | 0 | 0 | 0 |
Nature | 0 | 1 | 0 | 0 | 0 | 0 |
Culture, Nature | 0 | 1 | 1 | 0 | 0 | 0 |
Nature, Other | 0 | 1 | 0 | 0 | 0 | 1 |
Culture, Active | 0 | 0 | 1 | 0 | 1 | 0 |
I've tried this, but the function "mutate_at" didn't work, and also I didn't really understand how to insert my data.
Upvotes: 0
Views: 218
Reputation: 15318
In google sheets you can try this
=sort(query(
arrayformula(split(flatten(
arrayformula(trim(iferror(split(A2:A,",")))) &","&
transpose(arrayformula(substitute(query(transpose(B2:G),,9^9)," ",","))) ),",",true,false)),
"select Col1,sum(Col"&textjoin("),sum(Col",,sequence(1,columns(B1:G1),2,1))&")
where Col1 is not null group by Col1
label "&textjoin("', ",,arrayformula("sum(Col"&Column(B1:G1)&") '"&B1:G1))&"' "))
Upvotes: 0
Reputation: 8107
df <- data.frame(Traveller_Type =
c("Family, Nature",
"Nature",
"Culture, Nature",
"Nature, Other",
"Culture"))
library(dplyr)
library(tidyr)
df |>
mutate(id = row_number(), value = 1) |>
separate_rows(Traveller_Type) |>
pivot_wider(names_from = Traveller_Type, values_fill = 0)
# A tibble: 5 x 5
id Family Nature Culture Other
<int> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 0 0
2 2 0 1 0 0
3 3 0 1 1 0
4 4 0 1 0 1
5 5 0 0 1 0
Upvotes: 1