SecretIndividual

Reputation: 2519

Adding column based on data in other data frame

I would like to calculate the average exam score of each student and add this as a new column to a data frame:

library(dplyr)

my_students <- c("John", "Lisa", "Sam")
student_exam <- c("John", "Lisa", "John", "John")
score_exam <- c(7, 6, 7, 6)

students <- as.data.frame(my_students)
scores <- as.data.frame(student_exam)
scores <- cbind(scores, score_exam)

new_frame <- students %>% mutate(avg_score = (scores %>% filter(student_exam == my_students) %>% mean(score_exam)))

But the code above gives the following error:

Error in Ops.factor(student_exam, my_students) : 
  level sets of factors are different

I assume it has to do with filter(student_exam == my_students). How would I do this in dplyr?

Upvotes: 0

Views: 52

Answers (1)

mcz

Reputation: 587

Define both data frames with a matching key column, here called "name". You can then use group_by and summarize to compute the average score per student, and a join to bring in the full list of students. If the join prints a warning about factor levels being coerced to character (older R versions create factor columns by default), it can be ignored here. Not every student has exam scores: Sam has none, so his average is NA.

library(dplyr)

my_students <- c("John", "Lisa", "Sam")
student_exam <- c("John", "Lisa", "John", "John")
score_exam <- c(7, 6, 7, 6)

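# Use the same key column ("name") in both frames; keep names as character, not factor
students <- data.frame(name = my_students, stringsAsFactors = FALSE)
scores <- data.frame(name = student_exam, score = score_exam, stringsAsFactors = FALSE)

# Average score per student, then keep every student (Sam has no scores, so he gets NA)
avg_scores <- scores %>%
  group_by(name) %>%
  summarize(avgScore = mean(score)) %>%
  right_join(students, by = "name")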
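
If the goal is to keep the original students frame and just add the average as a new column, starting from students and using left_join is one way to do it. This is only a minimal sketch, assuming both frames share a character column called name; avg_score and new_frame are illustrative names borrowed from the question.

```r
library(dplyr)

# Same example data as above; stringsAsFactors = FALSE keeps names as character
students <- data.frame(name = c("John", "Lisa", "Sam"), stringsAsFactors = FALSE)
scores <- data.frame(
  name = c("John", "Lisa", "John", "John"),
  score = c(7, 6, 7, 6),
  stringsAsFactors = FALSE
)

# Compute the average once per student
avg_scores <- scores %>%
  group_by(name) %>%
  summarize(avg_score = mean(score))

# left_join keeps every row of students; students without scores get NA
new_frame <- students %>%
  left_join(avg_scores, by = "name")
```

Here new_frame has one row per student, and Sam's avg_score is NA because he has no rows in scores. Passing by = "name" explicitly avoids relying on dplyr guessing the join column.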

Upvotes: 2
