Reputation: 141
I would like to redefine the mean function (to apply it in a tabular() table) so that it omits all NA, NaN and Inf observations for a certain variable. I don't want to delete the whole row (observation), but rather have the mean formula simply calculate the mean of all values that are not NA, NaN or Inf.
Mean.new <- function(x) base::mean(x, na.rm=TRUE)
As far as I know, na.rm=TRUE in the standard mean() only removes NAs, not NaN and Inf.
Therefore, how do I add to the code above the functionality to check for is.finite() (which would exclude all NA, NaN, Inf)?
Thank you and best,
cork
Upvotes: 3
Views: 340
Reputation: 41240
With is.finite:
mean_new <- function(x) {mean(x[is.finite(x)])}
mean_new(c(NA,Inf,NaN,1,2))
[1] 1.5
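As a small check of what na.rm = TRUE already handles, here is a sketch using the same example vector. The variable names are only illustrative. In base R, is.na() is TRUE for NaN as well as NA, so na.rm = TRUE drops both, and only Inf is left to distort the result.

```r
x <- c(NA, Inf, NaN, 1, 2)

# is.na() flags NaN as missing too, so na.rm = TRUE removes NA and NaN
missing_flags <- is.na(x)
missing_flags
#[1]  TRUE FALSE  TRUE FALSE FALSE

# Inf is not missing, so it survives na.rm = TRUE
mean_na_rm <- mean(x, na.rm = TRUE)
mean_na_rm
#[1] Inf

# is.finite() keeps only the ordinary numbers 1 and 2
mean_finite <- mean(x[is.finite(x)])
mean_finite
#[1] 1.5
```

So the is.finite subset is needed for the Inf values, while NA and NaN would also be covered by na.rm = TRUE alone.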
Upvotes: 6
Reputation: 76585
Base R defines a default method for the generic mean, so here is a way that works by defining a method for objects of class "numeric".
The example data is taken from Waldi's answer. Unlike in his answer, I negate is.infinite, because is.finite also returns FALSE for missing values (NA and NaN). Subsetting with is.finite would therefore make the argument na.rm irrelevant, since missing values would always be removed. From the documentation ?is.finite, my emphasis:
Description
is.finite and is.infinite return a vector of the same length as x, indicating which elements are finite (not infinite and not missing) or infinite.
In this description, the "not missing" part refers to the finite elements only; the expected behavior of is.infinite is to return TRUE for -Inf/Inf values but not for NA or NaN.
The code then becomes
mean.numeric <- function(x, trim = 0, na.rm = FALSE, ...){
  # drop -Inf/Inf only; NA and NaN are left for na.rm to handle
  x <- x[!is.infinite(x)]
  mean.default(x, trim = trim, na.rm = na.rm, ...)
}
y <- c(NA,Inf,NaN,1,2)
is.finite(y)
#[1] FALSE FALSE FALSE TRUE TRUE
!is.infinite(y)
#[1] TRUE FALSE TRUE TRUE TRUE
mean(y)
#[1] NA
mean(y, na.rm = TRUE)
#[1] 1.5
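One thing to keep in mind is that a method defined this way is picked up by every call to mean() on a numeric vector made from the same environment, not just the ones in the table. The sketch below repeats the method definition so that it is self-contained, and shows that removing it with rm() restores the default behavior. The names with_method and without_method are only illustrative.

```r
# Same method as above, repeated so this block runs on its own
mean.numeric <- function(x, trim = 0, na.rm = FALSE, ...) {
  x <- x[!is.infinite(x)]
  mean.default(x, trim = trim, na.rm = na.rm, ...)
}

y <- c(NA, Inf, NaN, 1, 2)

# While the method exists, mean() dispatches to it and Inf is dropped
with_method <- mean(y, na.rm = TRUE)
with_method
#[1] 1.5

# Remove the method to get the standard behavior back
rm(mean.numeric)
without_method <- mean(y, na.rm = TRUE)
without_method
#[1] Inf
```

If changing mean() for the whole session is not wanted, a separately named wrapper function avoids the side effect.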
Upvotes: 3