Reputation: 79
In my survey I made a mistake when setting up a 5-point Likert scale: each response option ended up in its own column, as follows:
dput(head(edu_data))
structure(list(Education.1. = structure(c(1L, 1L, 1L, 1L, 1L,
1L), .Label = c("", "Y"), class = "factor"), Education.2. = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor"),
Education.3. = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("",
"Y"), class = "factor"), Education.4. = structure(c(1L, 1L,
1L, 2L, 2L, 1L), .Label = c("", "Y"), class = "factor"),
Education.5. = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("",
"Y"), class = "factor")), row.names = c(NA, 6L), class = "data.frame")
I would like to turn this into one column holding a single value per respondent, such that answer_to_ls = 1:5.
The output I want is a column with a single number, which means getting rid of the letter. I do, of course, have a unique respondent ID.
Please tell me if I can make my question clearer, as I want to be a valuable member of the community.
Upvotes: 1
Views: 221
Reputation: 30549
There are many potential solutions available; try searching for merging or collapsing multiple binary (dichotomous) columns into a single column.
In your case, you could try something like:
edu_data$answer_to_ls <- apply(edu_data[1:5] == "Y", 1, function(x) {
  # Take the digit from the name of the column that holds "Y"; NA if no column does
  if (any(x)) as.numeric(gsub(".*(\\d+).", "\\1", names(which(x)))) else NA
})
This extracts the number (1 to 5) from the name of the column that contains "Y", converts it to a numeric value, and returns NA if the row has no "Y" response. edu_data[1:5] selects the columns to consider for conversion, in this case columns 1 through 5.
  Education.1. Education.2. Education.3. Education.4. Education.5. answer_to_ls
1                                                                Y            5
2                                                                Y            5
3                                                                Y            5
4                                                   Y                         4
5                                                   Y                         4
6                                                                            NA
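If a respondent could have ticked more than one box, the apply() call above would return several numbers for that row. A minimal sketch of a stricter variant follows. It assumes you want NA for any row that does not have exactly one "Y", and the names scale_values and ticked are only illustrative. The data frame is rebuilt by hand from the dput() output in the question.

```r
# First six rows of the example data, rebuilt by hand from the dput() output
edu_data <- data.frame(
  Education.1. = factor(rep("", 6), levels = c("", "Y")),
  Education.2. = factor(rep("", 6), levels = c("", "Y")),
  Education.3. = factor(rep("", 6), levels = c("", "Y")),
  Education.4. = factor(c("", "", "", "Y", "Y", ""), levels = c("", "Y")),
  Education.5. = factor(c("Y", "Y", "Y", "", "", ""), levels = c("", "Y"))
)

# Scale value of each column, parsed from its name by dropping every non-digit
scale_values <- as.integer(gsub("\\D", "", names(edu_data)[1:5]))

# Logical matrix: TRUE where a box was ticked
ticked <- edu_data[1:5] == "Y"

# One value per row: the scale value of the ticked column, or NA when the row
# has no tick or (assumption) more than one tick
edu_data$answer_to_ls <- unname(apply(ticked, 1, function(x) {
  if (sum(x) == 1) scale_values[which(x)] else NA_integer_
}))

print(edu_data$answer_to_ls)
```

Parsing the scale values once, outside the loop, also keeps the regular expression away from the row-wise logic, which makes it easier to check on its own.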
Upvotes: 1
Reputation: 4427
d <- structure(list(Education.1. = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor"),
Education.2. = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor"),
Education.3. = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor"),
Education.4. = structure(c(1L, 1L, 1L, 2L, 2L, 1L), .Label = c("", "Y"), class = "factor"),
Education.5. = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor")),
row.names = c(NA, 6L), class = "data.frame")
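# Weighted sum of the indicator columns: each "Y" contributes its scale value.
# The full column names (with the trailing dot) avoid relying on partial matching by `$`.
d$item1 <- 1 * (d$Education.1. == "Y") +
  2 * (d$Education.2. == "Y") +
  3 * (d$Education.3. == "Y") +
  4 * (d$Education.4. == "Y") +
  5 * (d$Education.5. == "Y")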
print(d)
This leads to the output below. Note that row 6, which has no "Y", gets 0 rather than NA:
> print(d)
  Education.1. Education.2. Education.3. Education.4. Education.5. item1
1                                                                Y     5
2                                                                Y     5
3                                                                Y     5
4                                                   Y                  4
5                                                   Y                  4
6                                                                      0
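The weighted sum gives 0 for a row without any "Y", and it would add the values together if two boxes were ticked. A possible sketch of the same idea written as a matrix product, with those rows set to NA, is shown here. Treating such rows as missing is an assumption about what you want, and cols and ticked are illustrative names.

```r
# Same six example rows as above, rebuilt by hand
d <- data.frame(
  Education.1. = factor(rep("", 6), levels = c("", "Y")),
  Education.2. = factor(rep("", 6), levels = c("", "Y")),
  Education.3. = factor(rep("", 6), levels = c("", "Y")),
  Education.4. = factor(c("", "", "", "Y", "Y", ""), levels = c("", "Y")),
  Education.5. = factor(c("Y", "Y", "Y", "", "", ""), levels = c("", "Y"))
)

# Names of the five indicator columns
cols <- paste0("Education.", 1:5, ".")

# Logical matrix: TRUE where a box was ticked
ticked <- d[cols] == "Y"

# Weighted sum as a matrix product: 0/1 indicators times the scale values 1..5
d$item1 <- as.vector((ticked * 1) %*% 1:5)

# Assumption: rows without exactly one tick are treated as missing
d$item1[rowSums(ticked) != 1] <- NA

print(d$item1)
```

This scales to longer scales without writing one term per column, since only the 1:5 vector and the column names change.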
Upvotes: 0