Reputation: 181
In the code below, I'm trying to find the mean correct score for each item in the "category" column of the "regular season" dataset I'm working with.
rs_category <- list2env(split(regular_season, regular_season$category),
.GlobalEnv)
unique_categories <- unique(regular_season$category)
for (i in unique_categories)
Mean_[i] <- mean(regular_season$correct[regular_season$category == i], na.rm = TRUE, .groups = 'drop')
eapply(rs_category, Mean_[i])
print(i)
I'm having trouble getting this to work though. I have created a list of the items in the category as sub-datasets and separately, (I think) I have created a vector of the unique items in the category in order to run the for loop with. I have a feeling the problem may be with how I defined the mean function because an error occurs at the "eapply()" line and tells me "Mean_[i]" is not a function, but I can't think of how else to define the function. If someone could help, I would greatly appreciate it.
Upvotes: 1
Views: 33
Reputation: 887148
The issue would be that 'Mean_' has no element named i. In the code below, we initialize the object 'Mean_' as a numeric vector with the same length as 'unique_categories', then loop over the sequence of 'unique_categories', take the matching subset of 'correct', apply the mean function, and store the result as the i-th value of 'Mean_'.
Mean_ <- numeric(length(unique_categories))
for(i in seq_along(unique_categories)) {
  Mean_[i] <- mean(regular_season$correct[regular_season$category ==
                                           unique_categories[i]], na.rm = TRUE)
}
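As a minimal sketch of the loop described above, assuming a small made-up data frame with the same column names (the values are hypothetical, not from the original dataset), the result vector can also be given names so that each mean is labelled by its category:

```r
# Hypothetical toy data standing in for regular_season (an assumption)
regular_season <- data.frame(
  category = c("A", "A", "B", "B", "B"),
  correct  = c(1, 0, 1, NA, 1)
)

unique_categories <- unique(regular_season$category)

# Pre-allocate a named numeric vector, one slot per category
Mean_ <- setNames(numeric(length(unique_categories)), unique_categories)

for (i in seq_along(unique_categories)) {
  # Logical index of the rows belonging to the i-th category
  in_group <- regular_season$category == unique_categories[i]
  Mean_[i] <- mean(regular_season$correct[in_group], na.rm = TRUE)
}

print(Mean_)
```

Naming the vector is optional; it only makes it easier to see which mean belongs to which category.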
If faster execution is needed, use data.table:
library(data.table)
setDT(regular_season)[, .(Mean_ = mean(correct, na.rm = TRUE)), by = category]
Or using collapse
library(collapse)
with(regular_season, fmean(correct, g = category))
Upvotes: 1
Reputation: 388982
Instead of splitting the dataset and using a for loop, you can use the functions R already has for grouped operations, which apply a function to each unique group (value).
library(dplyr)
regular_season %>%
group_by(category) %>%
summarise(Mean_ = mean(correct, na.rm = TRUE)) -> result
This gives you the average value of correct for each category, where result$Mean_ is the vector that you are looking for.
In base R, this can be solved with aggregate.
result <- aggregate(correct~category, regular_season, mean, na.rm = TRUE)
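For the split-based approach the question started from, a base R sketch is shown here, again assuming a small hypothetical data frame with the same column names. The point it illustrates is that the apply-style functions expect a function such as mean, not an already-computed value:

```r
# Hypothetical toy data standing in for regular_season (an assumption)
regular_season <- data.frame(
  category = c("A", "A", "B", "B", "B"),
  correct  = c(1, 0, 1, NA, 1)
)

# Split 'correct' by category, then apply mean() to each piece;
# the function itself (mean) is passed, with na.rm forwarded to it
by_split <- sapply(split(regular_season$correct, regular_season$category),
                   mean, na.rm = TRUE)

# tapply() performs the same split-and-apply in a single call
by_tapply <- tapply(regular_season$correct, regular_season$category,
                    mean, na.rm = TRUE)

print(by_split)
print(by_tapply)
```

Both calls return one mean per category, labelled by category name.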
Upvotes: 1