Reputation: 10619
In a plot(x,y), is there any way to plot a line/curve/function that would split the observations into two halves **at every x (see DWin's comment)**, so that **around every x (see DWin's comment)** the same number of observations lie above and below this line/curve/function? Is there any way to get the (x,y) coordinates or the function of this line/curve/function?
As regressing the data is problematic due to outliers, non-normality, etc., I thought a programming method might provide a viable solution without resorting to complicated regression methods.
Thanks a lot
Upvotes: 1
Views: 247
Reputation: 269451
First generate some test data:
x <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)
y <- seq_along(x)
Now, assuming the data is sorted by x, calculate the median at each x and plot:
plot(y ~ x)
m <- tapply(y, x, median)
lines(m ~ unique(x))
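The snippet above relies on x being sorted, because tapply() returns the medians in the order of the sorted distinct x values while unique(x) keeps the order of first appearance. A minimal sketch that drops that assumption and also keeps the (x, y) coordinates of the median line is shown here; the unsorted data and the name `med` are made up for illustration:

```r
# Hypothetical example data, deliberately NOT sorted by x
x <- c(3, 1, 2, 3, 1, 3, 2, 1, 3)
y <- c(9, 1, 4, 6, 2, 7, 5, 3, 8)

# tapply() returns one median per distinct x, ordered by the sorted x values;
# those x values are stored as character strings in names(m)
m <- tapply(y, x, median)

# Coordinates of the median line, one row per distinct x
med <- data.frame(x = as.numeric(names(m)), y = as.vector(m))

plot(x, y)
lines(med$x, med$y)
```

The data frame `med` then holds the coordinates the question asks for, one point per distinct x.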
Upvotes: 4
Reputation: 263301
Implementing Bolker's idea is really quite simple. This just plots the results of the first example for the rq function in the quantreg package:
library(quantreg)
data(stackloss)
# tau = 0.5 requests the median (50th percentile) regression line
fit <- rq(stack.loss ~ Air.Flow, tau = 0.5, data = stackloss)
with(stackloss, plot(Air.Flow, stack.loss))
abline(a=coef(fit)[1], b=coef(fit)[2])
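To see how evenly the median regression line splits the observations overall, one can count the points above and below it. This is only a rough check, assuming the quantreg package is installed; the tolerance `tol` and the variable names are illustrative choices, not part of the package:

```r
library(quantreg)  # assumed to be installed

data(stackloss)
fit <- rq(stack.loss ~ Air.Flow, tau = 0.5, data = stackloss)

# Height of the fitted line at each observed x, from the same two
# coefficients that abline() uses
line_y <- coef(fit)[1] + coef(fit)[2] * stackloss$Air.Flow
res <- stackloss$stack.loss - line_y

# Points within a small tolerance of the line count as lying on it
tol <- 1e-8
n <- length(res)
above <- sum(res > tol)
below <- sum(res < -tol)
on_line <- n - above - below

print(c(above = above, below = below, on_line = on_line))
```

The split is global rather than per x: the line balances the counts over the whole data set, not within each group of equal Air.Flow values.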
However, that is not an "at every point" solution, so consider this loess approach:
fit <- loess(stack.loss ~ Air.Flow, data = stackloss, family = "symmetric")
plot(stack.loss ~ Air.Flow, data = stackloss)
# Evaluate the fit at each distinct x value, in increasing order
xs <- sort(unique(stackloss$Air.Flow))
lines(xs, predict(fit, data.frame(Air.Flow = xs)))
It doesn't do well at the x values where there is only one observation, but it seems to come pretty close to the median when using the family="symmetric" option.
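One way to check that impression is to tabulate the loess prediction next to the per-x median and group size. This is a sketch using base R only; the data frame name `cmp` and its column names are chosen here for illustration:

```r
data(stackloss)
fit <- loess(stack.loss ~ Air.Flow, data = stackloss, family = "symmetric")

# Per-x medians and group sizes, ordered by the sorted distinct x values
m <- tapply(stackloss$stack.loss, stackloss$Air.Flow, median)
n <- tapply(stackloss$stack.loss, stackloss$Air.Flow, length)
xs <- as.numeric(names(m))

# Side-by-side comparison of the loess curve and the medians
cmp <- data.frame(
  Air.Flow = xs,
  n        = as.vector(n),
  median   = as.vector(m),
  loess    = as.vector(predict(fit, data.frame(Air.Flow = xs)))
)
cmp$diff <- cmp$loess - cmp$median
print(cmp)
```

Rows with n equal to 1 are the x values where there is only a single observation, so the `diff` column shows directly how far the curve sits from the median there.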
Upvotes: 2