Reputation: 1078
I have a data frame with Likert scores across multiple aspects of a course (about 40 columns of Likert scores like the two in the sample data below).
Not all rows contain valid scores. Valid scores are 1:5. Invalid scores are coded as 96:99 or are simply missing.
I would like to create an average score for each individual ID for each of the satisfaction columns that:
1) filters out invalid scores,
2) computes the mean of the valid scores for each ID,
3) places the mean satisfaction score for each ID in a new column labelled [column.name].mean, as in Skill.satisfaction.mean below.
I have included a sample data frame below, followed by the transformation I would like, applied to a single column.
#### sample score vector
possible.scores <- c(1:5, 96, 97, 99, "")
#### data frame
ratings <- data.frame(
  ID = c(rep(1:7, each = 2), 8:10),
  Degree = c(rep("Double", times = 14), rep("Single", times = 3)),
  Skill.satisfaction = sample(possible.scores, size = 17, replace = TRUE),
  Social.satisfaction = sample(possible.scores, size = 17, replace = TRUE)
)
#### transformation applied to one of the satisfaction scales
library(dplyr)
ratings <- ratings %>%
  group_by(ID) %>%
  # note: filter() drops the rows with invalid scores entirely
  filter(!Skill.satisfaction %in% 96:99, Skill.satisfaction != "") %>%
  mutate(Skill.satisfaction.mean = mean(as.numeric(Skill.satisfaction), na.rm = TRUE))
Upvotes: 1
Views: 166
Reputation: 13125
library(dplyr)
ratings %>%
  group_by(ID) %>%
  # Change satisfaction columns from factor into numeric
  mutate_at(vars(-ID, -Degree), list(~as.numeric(as.character(.)))) %>%
  # Get the mean of the values in 1:5 only
  mutate_at(vars(-ID, -Degree), list(mean = ~mean(.[. %in% 1:5], na.rm = TRUE)))
# A tibble: 6 x 6
# Groups:   ID [3]
     ID Degree Skill.satisfaction Social.satisfaction Skill.satisfaction_mean Social.satisfaction_mean
  <int> <fct>               <dbl>               <dbl>                   <dbl>                    <dbl>
1     1 Double                 96                  99                       2                      NaN
2     1 Double                  2                  97                       2                      NaN
3     2 Double                  1                  97                       1                      NaN
4     2 Double                 97                  NA                       1                      NaN
5     3 Double                 96                  96                     NaN                        3
6     3 Double                 99                   3                     NaN                        3
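Note that the columns above come out with a `_mean` suffix, while the question asked for `[column.name].mean`. As a minimal sketch, assuming dplyr 1.0.0 or later (where `across()` supersedes `mutate_at()`), the same idea can be written so that the new columns get the requested `.mean` suffix. The helper name `valid_mean` and the small hand-made data frame are hypothetical, chosen only so the result is reproducible. The selector `ends_with(".satisfaction")` assumes all 40 or so score columns share that suffix.

```r
library(dplyr)

# Hypothetical hand-made data (not the asker's random sample), so the
# result is the same on every run. Scores are stored as character,
# as they would be after mixing numbers with "".
ratings <- data.frame(
  ID = c(1, 1, 2, 2, 3),
  Degree = c("Double", "Double", "Double", "Double", "Single"),
  Skill.satisfaction = c("96", "2", "1", "", "5"),
  Social.satisfaction = c("4", "97", "", "99", "3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean of the valid Likert scores (1 to 5).
# as.character() makes it safe for factor columns as well.
valid_mean <- function(x) {
  x <- suppressWarnings(as.numeric(as.character(x)))
  # NA and the 96:99 codes fail the %in% test and are dropped
  mean(x[x %in% 1:5])
}

result <- ratings %>%
  group_by(ID) %>%
  mutate(across(ends_with(".satisfaction"), valid_mean,
                .names = "{.col}.mean")) %>%
  ungroup()

print(result)
```

Unlike the `mutate_at()` version, this sketch leaves the original score columns untouched and only adds the `.mean` columns. An ID with no valid score in a column still gets `NaN` there, since the mean of an empty vector is `NaN`.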
Upvotes: 1