Reputation: 831
I have a data frame with survey results that looks like this:
          Q1         Q2       Q3
1      Agree No opinion Disagree
2 No opinion No opinion Disagree
3      Agree            Disagree
How can I convert the survey responses into numbers so that I can get the mean response for each question? I can use gsub to substitute numeric values for each text answer in each column, but there must be a better way.
> str(x)
'data.frame': 3 obs. of 3 variables:
$ Q1: Factor w/ 2 levels "Agree","No opinion": 1 2 1
$ Q2: Factor w/ 2 levels "","No opinion": 2 2 1
$ Q3: Factor w/ 1 level "Disagree": 1 1 1
Upvotes: 0
Views: 2304
Reputation: 59970
I must be misunderstanding what you want, but since you have categorical variables in a data.frame, can't you just use summary on it?
#Example
q1 <- sample( c("Agree" , "No opinion" ) , 10 , replace = TRUE )
q2 <- sample( c(" " , "No opinion" ) , 10 , replace = TRUE )
q3 <- sample( c("Agree" , "Disagree" ) , 10 , replace = TRUE )
x <- data.frame( q1 , q2 , q3 )
summary(x)
          q1              q2           q3
 Agree     :4             :4   Agree   :5
 No opinion:6   No opinion:6   Disagree:5
Upvotes: 0
Reputation: 3965
OK, it is clear now.
I would convert each column to character, then to factor (with common levels), then to integer:
sapply(data, function(x) as.integer(factor(as.character(x), levels=c("Agree", "No opinion", "Disagree"))))
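To get from there to the per-question means asked for, here is a minimal sketch. It assumes the coding Agree = 1, No opinion = 2, Disagree = 3 implied by the order of the levels above, and it rebuilds the three-row example from the question; the names lv, num and means are only illustrative. Any answer not listed in the levels, such as the blank in Q2, becomes NA, so the mean has to skip it.

```r
# Rebuild the example data from the question (Q2 has one blank answer).
x <- data.frame(
  Q1 = c("Agree", "No opinion", "Agree"),
  Q2 = c("No opinion", "No opinion", ""),
  Q3 = c("Disagree", "Disagree", "Disagree"),
  stringsAsFactors = TRUE
)

# Assumed coding: Agree = 1, No opinion = 2, Disagree = 3.
lv <- c("Agree", "No opinion", "Disagree")

# Convert every column to integer codes using the same set of levels.
# Answers that are not in lv (here the blank "") turn into NA.
num <- sapply(x, function(col) as.integer(factor(as.character(col), levels = lv)))

# Mean response per question, ignoring the NA left by the blank answer.
means <- colMeans(num, na.rm = TRUE)
print(means)
```

With this coding the means should come out to roughly 1.33 for Q1, 2 for Q2 and 3 for Q3. Whether "No opinion" belongs in the middle of the scale or should be treated as missing is a judgment call; to drop it, leave it out of lv so that it becomes NA as well.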
Upvotes: 5