Reputation: 143
It's been a while since I've had to drop by and ask for advice, but this one has me stumped and my searches for an answer haven't been fruitful (probably because the words I'm using are confusing search engines).
I have a survey data file that has columns that are either set to a 1 (if the respondent clicked the box) or are NA (if they didn't). I want to create a new field for each respondent that summarizes the checked boxes.
Below is a reprex with the starting df and the desire df2 I would like to create.
Any advice is greatly appreciated. Happy Sunday! And thanks in advance.
Original table:
library(tibble)
set.seed(125)
df <- tibble(
id = 1:10,
math = sample(c(rep(1, 3), rep(NA, 7))),
english = sample(c(rep(1, 5), rep(NA, 5))),
science = sample(c(rep(1, 6), rep(NA, 4))),
history = sample(c(rep(1, 8), rep(NA, 2)))
)
Looks like this:
# A tibble: 10 x 5
# id math english science history
# <int> <dbl> <dbl> <dbl> <dbl>
# 1 1 NA 1 NA 1
# 2 2 NA NA 1 1
# 3 3 NA 1 1 1
# 4 4 1 NA 1 1
# 5 5 1 1 NA 1
# 6 6 NA 1 1 1
# 7 7 NA NA NA NA
# 8 8 1 NA 1 1
# 9 9 NA 1 NA 1
#10 10 NA NA 1 NA
Desired table
# A tibble: 10 x 2
# id background
# <int> <chr>
# 1 1 "english, history"
# 2 2 "science, history"
# 3 3 "english, science, history"
# 4 4 "math, science, history"
# 5 5 "math, english, history"
# 6 6 "english, science, history"
# 7 7 ""
# 8 8 "math, science, history"
# 9 9 "english, history"
#10 10 "science"
Created on 2022-01-30 by the reprex package (v2.0.1)
Upvotes: 2
Views: 78
Reputation: 19181
You can use pivot_longer
and filter
with a final paste
to get the desired output
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols=-id) %>%
group_by(id) %>%
mutate(value=replace_na(value, 0),
v_sum=sum(value),
name=ifelse(v_sum==0,"",name)) %>%
filter(value==1|v_sum==0) %>%
summarise(background=paste(unique(name), collapse=", "))
# A tibble: 10 × 2
id background
<int> <chr>
1 1 "english, history"
2 2 "science, history"
3 3 "english, science, history"
4 4 "math, science, history"
5 5 "math, english, history"
6 6 "english, science, history"
7 7 ""
8 8 "math, science, history"
9 9 "english, history"
10 10 "science"
Upvotes: 0
Reputation: 34586
A dplyr
/purrr
approach:
library(dplyr)
library(purrr)
df %>%
transmute(id = id,
background = pmap_chr(across(-id, ~ cur_column()[.x]), ~ toString(na.omit(c(...)))))
# A tibble: 10 x 2
id background
<int> <chr>
1 1 "english, history"
2 2 "science, history"
3 3 "english, science, history"
4 4 "math, science, history"
5 5 "math, english, history"
6 6 "english, science, history"
7 7 ""
8 8 "math, science, history"
9 9 "english, history"
10 10 "science"
Upvotes: 2