Reputation: 71
I have survey data for 30+ questions on a 5-point Likert scale (Strongly Agree to Strongly Disagree).
Here's some sample data:
survey_df <- data.frame("ID" = c(1,2,3,4,5),
                        "Q1" = c("Strongly Agree", "Strongly Agree", "Agree", "Agree", "Neither"),
                        "Q2" = c("Agree", "Strongly Disagree", "Disagree", "Agree", "Neither"),
                        "Q3" = c("Neither", "Neither", "Disagree", "Agree", "Neither"))
So basically I want to go from this:
ID  Q1              Q2                 Q3
1   Strongly Agree  Agree              Neither
2   Strongly Agree  Strongly Disagree  Neither
3   Agree           Disagree           Disagree
4   Agree           Agree              Agree
5   Neither         Neither            Neither
To this:
Question  Strongly.Agree  Agree  Neither  Disagree  Strongly.Disagree  N.Count
Q1        0.4             0.4    0.2      0.0       0.0                5
Q2        0.0             0.4    0.2      0.2       0.2                5
Q3        0.0             0.2    0.6      0.2       0.0                5
Upvotes: 0
Views: 211
Reputation: 2185
You can use gather() and spread() from the tidyr package to reshape your data frame (long, then wide again with one column per answer), and then mutate() and mutate_at() from dplyr to calculate N.Count and the proportion of each answer.
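library(tidyverse)
survey_df %>%
  # Reshape to long format: one row per (ID, Question, Answer)
  gather(-ID, key = 'Question', value = 'Answer') %>%
  # Count how often each answer occurs per question
  count(Question, Answer) %>%
  # One column per answer; combinations that never occur become 0
  spread(key = Answer, value = n, fill = 0) %>%
  # Total number of responses per question
  mutate(N.Count = Agree + Disagree + Neither + `Strongly Agree` + `Strongly Disagree`) %>%
  # Convert counts to proportions (funs() is deprecated since dplyr 0.8.0, so use a formula)
  mutate_at(vars(-Question, -N.Count), ~ . / N.Count)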
# Question Agree Disagree Neither `Strongly Agree` `Strongly Disagree` N.Count
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 Q1 0.4 0 0.2 0.4 0 5
# 2 Q2 0.4 0.2 0.2 0 0.2 5
# 3 Q3 0.2 0.2 0.6 0 0 5
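Note that the columns above come out in alphabetical order rather than the order asked for in the question. As a rough sketch of one way to control that, the same idea can be written with pivot_longer() and pivot_wider(), which superseded gather() and spread() in tidyr 1.0.0. The names likert_levels and result are my own, and I am assuming dplyr >= 1.0.0 and tidyr >= 1.0.0 are installed. Turning the answers into a factor fixes the column order, and .drop = FALSE keeps answers that nobody chose as 0 instead of dropping them.

```r
library(dplyr)
library(tidyr)

survey_df <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  Q1 = c("Strongly Agree", "Strongly Agree", "Agree", "Agree", "Neither"),
  Q2 = c("Agree", "Strongly Disagree", "Disagree", "Agree", "Neither"),
  Q3 = c("Neither", "Neither", "Disagree", "Agree", "Neither")
)

# Assumed ordering of the scale; adjust to match the labels in your data
likert_levels <- c("Strongly Agree", "Agree", "Neither",
                   "Disagree", "Strongly Disagree")

result <- survey_df %>%
  # Long format: one row per (ID, Question, Answer)
  pivot_longer(-ID, names_to = "Question", values_to = "Answer") %>%
  # A factor fixes the column order and lets count() keep unused levels
  mutate(Answer = factor(Answer, levels = likert_levels)) %>%
  count(Question, Answer, .drop = FALSE) %>%
  # Responses per question, then the share of each answer
  group_by(Question) %>%
  mutate(N.Count = sum(n), prop = n / N.Count) %>%
  ungroup() %>%
  select(-n) %>%
  # Back to wide format: one column per answer
  pivot_wider(names_from = Answer, values_from = prop) %>%
  select(Question, all_of(likert_levels), N.Count)

print(result)
```

Because N.Count is computed with sum(n) per question, this does not hard-code the answer names, so it should keep working if a question is missing one of the five answers entirely. The column names keep their spaces (for example `Strongly Agree`); rename them afterwards if you need Strongly.Agree.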
Upvotes: 2