Reputation: 2519
I would like to calculate the average exam score of each student and add this as a new column to a data frame:
library(dplyr)
my_students <- c("John", "Lisa", "Sam")
student_exam <- c("John", "Lisa", "John", "John")
score_exam <- c(7, 6, 7, 6)
students <- as.data.frame(my_students)
scores <- as.data.frame(student_exam)
scores <- cbind(scores, score_exam)
new_frame <- students %>% mutate(avg_score = (scores %>% filter(student_exam == my_students) %>% mean(score_exam)))
But the code above gives the following error:
Error in Ops.factor(student_exam, my_students) :
level sets of factors are different
I assume it has to do with filter(student_exam == my_students). How would I do this in dplyr?
Upvotes: 0
Views: 52
Reputation: 587
Define both data frames with a matching column, here called "name", and store it as character rather than factor. The error in the question comes from comparing two factors whose level sets differ. You can then use group_by and summarize to compute the average score per student, and right_join to keep every student from the class list. Passing by = "name" makes the join column explicit. Not every student in your class has an exam score, so Sam's average score is NA.
library(dplyr)

my_students <- c("John", "Lisa", "Sam")
student_exam <- c("John", "Lisa", "John", "John")
score_exam <- c(7, 6, 7, 6)

# Keep names as character (not factor) so the join keys compare cleanly
students <- data.frame(name = my_students, stringsAsFactors = FALSE)
scores <- data.frame(name = student_exam, score = score_exam, stringsAsFactors = FALSE)

# Average score per student, then keep every student listed in `students`
avg_scores <- scores %>%
  group_by(name) %>%
  summarize(avgScore = mean(score)) %>%
  right_join(students, by = "name")
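If you want the result in the shape the question asks for (the students data frame with one extra column), a left_join starting from students may read more naturally. This is only a sketch of the same idea: the names avg_by_student and avg_score are made up for illustration, and the data is the example from the question.

```r
library(dplyr)

# Example data from the question, kept as character columns
students <- data.frame(name = c("John", "Lisa", "Sam"), stringsAsFactors = FALSE)
scores <- data.frame(
  name = c("John", "Lisa", "John", "John"),
  score = c(7, 6, 7, 6),
  stringsAsFactors = FALSE
)

# Per-student averages (avg_by_student is an illustrative name)
avg_by_student <- scores %>%
  group_by(name) %>%
  summarize(avg_score = mean(score))

# Start from `students` so every student is kept; students without scores get NA
new_frame <- students %>%
  left_join(avg_by_student, by = "name")

print(new_frame)
```

Starting from students keeps its row order and columns, and Sam still ends up with NA because he has no rows in scores.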
Upvotes: 2