Daniel Ortiz

Reputation: 79

Melting and converting a badly labeled Likert scale in R

In my survey I made a mistake with a 5-point Likert scale: each scale point was recorded in its own column, as follows:

dput(head(edu_data))
structure(list(Education.1. = structure(c(1L, 1L, 1L, 1L, 1L, 
1L), .Label = c("", "Y"), class = "factor"), Education.2. = structure(c(1L, 
1L, 1L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor"), 
Education.3. = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", 
"Y"), class = "factor"), Education.4. = structure(c(1L, 1L, 
1L, 2L, 2L, 1L), .Label = c("", "Y"), class = "factor"), 
Education.5. = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("", 
"Y"), class = "factor")), row.names = c(NA, 6L), class = "data.frame")

I would like to collapse this into one column holding a single value per respondent, such that answer_to_ls is in 1:5.

The output I want is a column with a single number, which means getting rid of the letter. I do, of course, have a unique respondent ID.

Please tell me if I can make my question clearer, as I want to be a valuable member of the community.

Upvotes: 1

Views: 221

Answers (2)

Ben

Reputation: 30549

I think there are a lot of potential solutions available. Try searching for merging or collapsing multiple binary or dichotomous columns into a single column. For example:

R - Convert various dummy/logical variables into a single categorical variable/factor from their name

In your case, you could try something like:

edu_data$answer_to_ls <- apply(edu_data[1:5] == "Y", 1, function(x) {
  # take the digit from the name of the column marked "Y"; NA if none is marked
  if (any(x)) as.numeric(gsub(".*(\\d+).", "\\1", names(which(x)))) else NA
})

This will extract the number from the column name for the Likert scale response 1 to 5, make it a numeric value, and include NA if there are no "Y" responses. edu_data[1:5] selects those columns to consider for conversion, in this case columns 1 through 5.

  Education.1. Education.2. Education.3. Education.4. Education.5. answer_to_ls
1                                                                Y            5
2                                                                Y            5
3                                                                Y            5
4                                                   Y                         4
5                                                   Y                         4
6                                                                            NA
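If a row could ever contain more than one "Y", the apply() call above would no longer return a plain vector, because the function would return several numbers for that row. A minimal vectorised sketch that guards against this, assuming the scale point is the only number in each column name (the example values and the helper names is_y, level and n_marked are made up for illustration):

```r
# Hypothetical example data in the same shape as the question
edu_data <- data.frame(
  Education.1. = c("", "", "Y", ""),
  Education.2. = c("", "", "", ""),
  Education.3. = c("", "", "", ""),
  Education.4. = c("", "Y", "", ""),
  Education.5. = c("Y", "", "", "")
)

is_y     <- edu_data[1:5] == "Y"                          # logical matrix, one row per respondent
level    <- as.integer(gsub("\\D", "", colnames(is_y)))   # scale points taken from the column names
n_marked <- unname(rowSums(is_y))                         # number of "Y" marks per row

# Exactly one "Y": use that column's scale point; otherwise (none or several) NA
edu_data$answer_to_ls <- ifelse(
  n_marked == 1,
  level[max.col(is_y * 1, ties.method = "first")],
  NA_integer_
)
```

Rows with no "Y" or with several "Y" marks both end up as NA here; whether that is the right treatment for double-marked rows depends on the survey.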

Upvotes: 1

Bernhard

Reputation: 4427

d <- structure(list(Education.1. = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor"), 
               Education.2. = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor"),
               Education.3. = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor"), 
               Education.4. = structure(c(1L, 1L, 1L, 2L, 2L, 1L), .Label = c("", "Y"), class = "factor"), 
               Education.5. = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("", "Y"), class = "factor")), 
               row.names = c(NA, 6L), class = "data.frame")

# use the full column names (with trailing dot) rather than relying on partial matching
d$item1 <- 1 * (d$Education.1. == "Y") +
           2 * (d$Education.2. == "Y") +
           3 * (d$Education.3. == "Y") +
           4 * (d$Education.4. == "Y") +
           5 * (d$Education.5. == "Y")

print(d)

leads to

> print(d)
  Education.1. Education.2. Education.3. Education.4. Education.5. item1
1                                                                Y     5
2                                                                Y     5
3                                                                Y     5
4                                                   Y                  4
5                                                   Y                  4
6                                                                      0
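Note that with this weighted sum a respondent without any "Y" gets 0 rather than a missing value, and a row with two "Y" marks would get the sum of both scale points. If NA is preferred for non-response, one possible follow-up step is sketched below on a small made-up data frame (only two of the five columns, for brevity):

```r
# Hypothetical mini example: respondents answering 5, 4, and nothing
d <- data.frame(
  Education.4. = c("", "Y", ""),
  Education.5. = c("Y", "", "")
)

d$item1 <- 4 * (d$Education.4. == "Y") +
           5 * (d$Education.5. == "Y")

# a sum of 0 means no column was marked, so treat it as missing
d$item1[d$item1 == 0] <- NA
```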

Upvotes: 0
