Reputation: 163
This may seem trivial, but I have code that finds the average of the closest two numbers in a set of three numbers. So, for these 5 examples:
x1 <- c(1,5,7)
x2 <- c(NA,2,3)
x3 <- c(2,6,4)
x4 <- c(1,NA,NA)
x5 <- c(1,3,1)
I would get an output of
y1 = 6
y2 = 2.5
y3 = 4
y4 = 1
y5 = 1
respectively. Basically, it finds the closest 2 numbers and averages them, accounting for NA values and ties.
This code is a monstrosity:
x <- x[!is.na(x)]
x <- x[order(x)]
y <- ifelse(length(x) == 1, x,
       ifelse(length(x) == 2, mean(x),
         ifelse(length(x) == 3,
           ifelse(abs(x[1] - x[2]) == abs(x[2] - x[3]), mean(x),
             ifelse(abs(x[1] - x[2]) > abs(x[2] - x[3]), mean(x[2:3]),
               ifelse(abs(x[1] - x[2]) < abs(x[2] - x[3]), mean(x[1:2]),
                 "error"))), NA)))
It works, but because this is part of a larger for loop, I was wondering whether there's a better way of doing this.
Upvotes: 2
Views: 358
Reputation: 269905
We define an S3 generic with "list" and "default" methods.
The "default" method takes a vector and sorts it (which also removes NA values). If at most one value is left, it returns that value, or NA if none remain. If two values are left, or the two successive differences are equal, it returns the mean of all the values; otherwise, it finds the index of the first element of the closest pair and returns the mean of that pair.
The "list" method applies the default method to each element of the list.
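mean_min_diff <- function(x) UseMethod("mean_min_diff")
# list method: apply the default method to each element
mean_min_diff.list <- function(x) sapply(x, mean_min_diff.default)
mean_min_diff.default <- function(x) {
  x0 <- sort(x)  # sort() drops NA values by default
  # zero or one value left: return it, or NA if nothing is left
  if (length(x0) <= 1) c(x0, NA)[1]
  # two values, or equal gaps (a tie): mean of everything
  else if (length(x0) == 2 || sd(diff(x0)) == 0) mean(x0)
  # otherwise: mean of the pair with the smallest gap
  else mean(x0[seq(which.min(diff(x0)), length.out = 2)])
}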
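To make the steps of the default method easier to follow, here is a small trace of the intermediate values for one input. The variable names `d`, `i`, `pair`, `no_na` and `tie` are only for illustration and are not part of the answer's code.

```r
x1 <- c(1, 5, 7)

x0   <- sort(x1)                    # ascending order: 1 5 7
d    <- diff(x0)                    # gaps between neighbours: 4 2
i    <- which.min(d)                # position where the smallest gap starts: 2
pair <- x0[seq(i, length.out = 2)]  # the closest pair: 5 7
mean(pair)                          # 6

# sort() drops NA values by default, so no separate is.na() filter is needed
no_na <- sort(c(NA, 2, 3))          # 2 3

# A tie shows up as equal gaps, i.e. the gaps have zero standard deviation
tie <- sd(diff(sort(c(2, 6, 4)))) == 0   # TRUE
```

Note that the tie check compares with `== 0` exactly, which is fine for integer-valued data like the examples here; for non-integer values a tolerance-based comparison may be safer.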
Now test it out:
mean_min_diff(x1)
## [1] 6
mean_min_diff(list(x1, x2, x3, x4, x5))
## [1] 6.0 2.5 4.0 1.0 1.0
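Since the question mentions a larger for loop, here is a minimal sketch of one way the loop might be avoided, assuming (this is not stated in the question) that the triples are stored as the rows of a numeric matrix. The matrix `m` and the result `res` are hypothetical names; the function definitions are repeated only so that the block runs on its own.

```r
# Definitions repeated from above so this block is self-contained
mean_min_diff <- function(x) UseMethod("mean_min_diff")
mean_min_diff.default <- function(x) {
  x0 <- sort(x)
  if (length(x0) <= 1) c(x0, NA)[1]
  else if (length(x0) == 2 || sd(diff(x0)) == 0) mean(x0)
  else mean(x0[seq(which.min(diff(x0)), length.out = 2)])
}

# Hypothetical layout: one triple per row
m <- rbind(c(1, 5, 7),
           c(NA, 2, 3),
           c(2, 6, 4),
           c(1, NA, NA),
           c(1, 3, 1))

# apply() hands each row over as a plain numeric vector,
# so the default method is dispatched for every row
res <- apply(m, 1, mean_min_diff)
res
## [1] 6.0 2.5 4.0 1.0 1.0
```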
Upvotes: 4