Reputation: 1087
I have some qualitative data that I have coded into various categories, and I want to provide summaries for subgroups. The RQDA package is great for coding interviews, but I've struggled with creating summaries for open-ended survey responses. I've managed to export the coded file to HTML and copy/paste it into Excel. I now have 500 rows with the categories in separate columns; however, the same code may appear in different columns. For example, some data:
a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA")
b <- c("ResponseD", "ResponseC", "NA", "NA","NA")
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA")
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA")
df <- data.frame (a,b,c,d)
I'd like to be able to run something like
df$ResponseA <- recode (df$a | df$b | df$c, "
'ResponseA' = '1';
else='0' ")
df$ResponseB <- recode (df$a | df$b | df$c, "
'ResponseB' = '1';
else='0' ")
In short, I'd like to scan 9 columns and recode them into a single binary variable.
Upvotes: 0
Views: 404
Reputation: 193687
If I understand the question correctly, perhaps you can try something like this:
## Convert your data into a long format first
dfL <- cbind(id = sequence(nrow(df)), stack(lapply(df, as.character)))
## The next three lines are mostly cleanup
dfL$id <- factor(dfL$id, sequence(nrow(df)))
dfL$values[dfL$values == "NA"] <- NA
dfL <- dfL[complete.cases(dfL), ]
## `table` is the real workhorse here
cbind(df, (table(dfL[1:2]) > 0) * 1)
#           a         b         c         d ResponseA ResponseB ResponseC ResponseD ResponseE
# 1 ResponseA ResponseD ResponseB ResponseC         1         1         1         1         0
# 2 ResponseB ResponseC ResponseA ResponseB         1         1         1         0         0
# 3 ResponseC        NA ResponseE ResponseA         1         0         1         0         1
# 4 ResponseD        NA        NA        NA         0         0         0         1         0
# 5        NA        NA        NA        NA         0         0         0         0         0
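In case it's unclear why the "cleanup" lines matter, here is a small sketch using the question's sample data (the variable names `n_without` and `n_with` are mine, purely for illustration). As far as I can tell, the `factor()` step is what keeps row 5 in the table: once the `NA` rows are dropped, that row has no entries left, so without explicit levels `table` would return only four rows and would no longer line up with `df`.

```r
a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA")
b <- c("ResponseD", "ResponseC", "NA", "NA", "NA")
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA")
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA")
df <- data.frame(a, b, c, d)

# Long format, with the literal string "NA" turned into a real NA and dropped
dfL <- cbind(id = sequence(nrow(df)), stack(lapply(df, as.character)))
dfL$values[dfL$values == "NA"] <- NA
dfL <- dfL[complete.cases(dfL), ]

# Plain integer ids: row 5 has no remaining entries, so it drops out
n_without <- nrow(table(dfL[1:2]))   # 4

# Ids as a factor with one level per original row: row 5 is kept as all zeros
dfL$id <- factor(dfL$id, levels = sequence(nrow(df)))
n_with <- nrow(table(dfL[1:2]))      # 5
```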
You can also try the following:
(table(rep(1:nrow(df), ncol(df)), unlist(df)) > 0) * 1L
#    
#     NA ResponseA ResponseB ResponseC ResponseD ResponseE
#   1  0         1         1         1         1         0
#   2  0         1         1         1         0         0
#   3  1         1         0         1         0         1
#   4  1         0         0         0         1         0
#   5  1         0         0         0         0         0
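The one-liner counts the literal string "NA" as a category of its own, which is where the extra `NA` column comes from. If you don't want that column, a possible variation (a sketch only, assuming "NA" is never a genuine code in your data; `codes`, `row_id`, `indicators` and `result` are names I made up) is to convert those strings to real missing values first, since `table` leaves out `NA` by default:

```r
a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA")
b <- c("ResponseD", "ResponseC", "NA", "NA", "NA")
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA")
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA")
df <- data.frame(a, b, c, d, stringsAsFactors = FALSE)

# Flatten column by column and treat the string "NA" as missing
# (assumption: "NA" is not a real code)
codes <- unlist(df, use.names = FALSE)
codes[codes == "NA"] <- NA

# unlist() goes column by column, so the row index repeats once per column
row_id <- rep(seq_len(nrow(df)), times = ncol(df))

# "> 0" collapses a code that appears several times in one row to a single 1
indicators <- (table(row_id, codes) > 0) * 1L

# Attach the indicator columns to the original data
result <- cbind(df, indicators)
```

Row 5 is still there (all zeros) because `row_id` is tabulated over all of its values, not only the rows that have a non-missing code.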
Upvotes: 1