ocwa80

Reputation: 11

Conditional mean based on cell reference

I'd like to add a column to my data set that is populated with the average of the values in a given category.

Using the iris data set as an example:

library(datasets)
library(dplyr)  # mutate() comes from dplyr
head(iris)
unique(iris$Species)
mutate(iris, spec_avg_pet_width = 1)

I'd like to replace "1" with the average petal width of setosa, versicolor or virginica, matching the Species of that row. I will ultimately be using this average as a way to arrange categories in a ggplot. The Excel equivalent of this would be an AVERAGEIFS function.

I know how to do a conditional mean based on an absolute value:

mean(iris[iris$Species =='setosa','Petal.Width'])

But I cannot figure out how to do a conditional mean based on a relative, corresponding value.

Essentially I would like my new column to return the values of 0.246, 2.026 or 1.326 depending on whether the row is setosa, virginica or versicolor.

Thank you!

Upvotes: 0

Views: 19

Answers (1)

Allan Cameron

Reputation: 173858

This is exactly what the base R function ave does:

ave(iris$Petal.Width, iris$Species)
#>   [1] 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246
#>  [13] 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246
#>  [25] 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246
#>  [37] 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246 0.246
#>  [49] 0.246 0.246 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326
#>  [61] 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326
#>  [73] 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326
#>  [85] 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326 1.326
#>  [97] 1.326 1.326 1.326 1.326 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026
#> [109] 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026
#> [121] 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026
#> [133] 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026 2.026
#> [145] 2.026 2.026 2.026 2.026 2.026 2.026
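
To see what is going on, here is a rough sketch of the same calculation done by hand. This is only an illustration, not how ave is implemented internally, and the names species_means and row_means are made up for this example. It computes one mean per species, then looks up each row's mean by that row's species:

```r
# One mean per species, returned as a named array
species_means <- tapply(iris$Petal.Width, iris$Species, mean)

# Look up the mean for each row by species name; as.vector() drops
# the names and dim attributes so a plain numeric vector remains
row_means <- as.vector(species_means[as.character(iris$Species)])

# ave() uses mean as its default FUN, so both approaches should agree
stopifnot(isTRUE(all.equal(row_means, ave(iris$Petal.Width, iris$Species))))
```

ave does the grouping and the lookup in a single call, which is why it returns a vector the same length as its input.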

So, for example, to get your new column you could do:

within(iris, spec_avg_pet_width <- ave(Petal.Width, Species))
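
Since the question already uses mutate, a dplyr version might look like the sketch below. It assumes the dplyr package is installed, and iris_avg is just an illustrative name:

```r
library(dplyr)

# Group by species so that mean() is computed within each group,
# then ungroup so later operations act on the whole data frame
iris_avg <- iris %>%
  group_by(Species) %>%
  mutate(spec_avg_pet_width = mean(Petal.Width)) %>%
  ungroup()

head(iris_avg)
```

Inside a grouped mutate, mean(Petal.Width) is recycled to every row of its group, so each row ends up with its own species' average.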

Created on 2022-09-23 with reprex v2.0.2

Upvotes: 0
