Aaron

Reputation: 181

Trouble constructing a function properly in R

In the code below, I'm trying to find the mean correct score for each item in the "category" column of the "regular season" dataset I'm working with.

rs_category <- list2env(split(regular_season, regular_season$category), 
        .GlobalEnv)
unique_categories <- unique(regular_season$category)

for (i in unique_categories)
  Mean_[i] <- mean(regular_season$correct[regular_season$category == i], na.rm = TRUE, .groups = 'drop')
  eapply(rs_category, Mean_[i])
print(i)

I'm having trouble getting this to work, though. I have split the dataset into one sub-dataset per category and, separately, (I think) created a vector of the unique categories to drive the for loop. I suspect the problem is in how I defined the mean function, because the error occurs at the "eapply()" line and tells me "Mean_[i]" is not a function, but I can't think of how else to define it. If someone could help, I would greatly appreciate it.

Upvotes: 1

Views: 33

Answers (2)

akrun

Reputation: 887148

The issue is that Mean_ has no element named by i: the object is never created before the loop, and i holds a category value rather than a position. In the code below, we initialize 'Mean_' as a numeric vector with the same length as 'unique_categories', then loop over the sequence of 'unique_categories', subset 'correct' for each category, apply the mean function, and store the result as the ith value of 'Mean_'.

Mean_ <- numeric(length(unique_categories))
for (i in seq_along(unique_categories)) {
  Mean_[i] <- mean(regular_season$correct[regular_season$category ==
                                            unique_categories[i]], na.rm = TRUE)
}
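
As a minimal sketch of the loop above on made-up data (the toy data frame and its values are assumptions for illustration, not the asker's dataset), naming the result vector with setNames makes it easier to see which mean belongs to which category:

```r
# Hypothetical toy data standing in for the real regular_season dataset
regular_season <- data.frame(
  category = c("history", "science", "history", "science", "sports", "history"),
  correct  = c(1, 0, 1, 1, 0, NA),
  stringsAsFactors = FALSE
)

unique_categories <- unique(regular_season$category)

# Pre-allocate the result and label each slot with its category
Mean_ <- setNames(numeric(length(unique_categories)), unique_categories)

for (i in seq_along(unique_categories)) {
  # Logical index of the rows belonging to the i-th category
  in_category <- regular_season$category == unique_categories[i]
  Mean_[i] <- mean(regular_season$correct[in_category], na.rm = TRUE)
}

print(Mean_)
```

Looping over seq_along(unique_categories) keeps i an integer position, so Mean_[i] always refers to an existing slot of the pre-allocated vector.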

If faster execution is needed, use data.table

library(data.table)
setDT(regular_season)[, .(Mean_ = mean(correct, na.rm = TRUE)), category]

Or using collapse

library(collapse)
# g must be a grouping vector, so evaluate both columns inside the data frame
with(regular_season, fmean(correct, g = category))

Upvotes: 1

Ronak Shah

Reputation: 388982

Instead of splitting the dataset and using a for loop, you can use the functions R provides for grouped operations, which apply a function to each unique group (value).

library(dplyr)

regular_season %>%
  group_by(category) %>%
  summarise(Mean_ = mean(correct, na.rm = TRUE)) -> result

This gives you the average value of correct for each category, where result$Mean_ is the vector you are looking for.

In base R, this can be solved with aggregate.

result <- aggregate(correct~category, regular_season, mean, na.rm = TRUE)
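
The "not a function" error in the question arises because the apply family expects a function as its argument, not a value such as Mean_[i]. As a rough sketch on made-up data (the toy data frame is an assumption, not the asker's dataset), two other base R ways to pass a function per group are sapply over split pieces and tapply:

```r
# Hypothetical toy data standing in for the real regular_season dataset
regular_season <- data.frame(
  category = c("history", "science", "history", "science", "sports", "history"),
  correct  = c(1, 0, 1, 1, 0, NA),
  stringsAsFactors = FALSE
)

# split() yields one data frame per category; sapply() calls the
# anonymous function on each piece and returns a named vector
by_split <- sapply(
  split(regular_season, regular_season$category),
  function(d) mean(d$correct, na.rm = TRUE)
)

# tapply() groups 'correct' by 'category' and applies mean in one call;
# na.rm = TRUE is forwarded to mean()
by_tapply <- tapply(regular_season$correct, regular_season$category,
                    mean, na.rm = TRUE)

print(by_split)
print(by_tapply)
```

Both calls return one mean per category, labeled by category name, without creating objects in the global environment.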

Upvotes: 1
