Reputation: 249
The dataset I have stores each participant's responses as the text label rather than just the numeric code. For instance, if the answer choices for a variable are:
1) A little
2) Somewhat
3) Not at all
Then someone who chooses the first choice will have their data shown as:
(1) A little
Rather than
1
which would be easier to analyze. The dataset source provides R code to convert the text to numeric values:
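library(prettyR)
# Collect the factor levels and strip the "(n) " prefix to keep only the label text
lbls <- sort(levels(data$Variable1))
lbls <- (sub("^\\([0-9]+\\) +(.+$)", "\\1", lbls))
# Replace each response with the number inside the parentheses
data$Variable1 <- as.numeric(sub("^\\(0*([0-9]+)\\).+$", "\\1", data$Variable1))
# Reattach the text labels to the numeric codes
data$Variable1 <- add.value.labels(data$Variable1, lbls)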
While this works, I've had to run it separately for each variable. There are over 400 variables in the dataset, and there are multiple datasets to work with. Is there a way to adjust the code so that it converts the text factor to a numeric one for every variable in the dataset at once?
Upvotes: 0
Views: 57
Reputation: 66415
Let's say you have this data:
data <- data.frame(stringsAsFactors = FALSE,
                   responses  = c("1) A little", "2) Somewhat", "3) Not at all"),
                   responses2 = c("2) Somewhat", "1) A little", "3) Not at all"),
                   responses3 = c("2) Somewhat", "3) Not at all", "1) A little"))
Here's an alternative to your regex method:
readr::parse_number(data$responses)
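One caveat: the columns in the question are factors, and as far as I know parse_number() expects a character vector, so a factor would need converting first. A minimal sketch, assuming a made-up factor `x` in the "(1) A little" format from the question:

```r
library(readr)

# Hypothetical factor column in the "(code) label" format from the question
x <- factor(c("(1) A little", "(3) Not at all", "(2) Somewhat"))

# Convert the factor to character before parsing, then pull out the number
codes <- parse_number(as.character(x))
codes
```

The parentheses around the code should not matter here, since parse_number() skips non-numeric characters before and after the first number it finds.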
Here's one way to apply that to all columns:
library(dplyr)
library(readr)  # provides parse_number()
data %>%
  mutate_all(parse_number)
  responses responses2 responses3
1         1          2          2
2         2          1          3
3         3          3          1
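If you would rather stay close to the regex from the question and avoid extra packages, something like the following base R sketch might work. The data frame, the column names, and the helper `to_code()` are made up for illustration, and it assumes every factor column uses the "(code) label" format; it does not reattach the value labels the way add.value.labels() does.

```r
# Hypothetical data in the "(code) label" format from the question
data <- data.frame(
  Variable1 = factor(c("(1) A little", "(2) Somewhat", "(3) Not at all")),
  Variable2 = factor(c("(2) Somewhat", "(1) A little", "(3) Not at all")),
  age       = c(34, 51, 27)
)

# Hypothetical helper: keep only the number inside the leading parentheses
to_code <- function(x) {
  as.numeric(sub("^\\(0*([0-9]+)\\).+$", "\\1", as.character(x)))
}

# Convert only the factor columns and leave everything else untouched
is_coded <- vapply(data, is.factor, logical(1))
data[is_coded] <- lapply(data[is_coded], to_code)
```

Restricting the conversion to factor columns keeps genuinely numeric variables (like `age` here) as they are. For several datasets, the last two lines could be wrapped in a function and applied to a list of data frames with lapply().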
Upvotes: 1