Reputation: 11859
I want to plot foo ~ bar. However, I don't want to look at the exact data; I'd rather break bar into, say, quantiles, and plot mean(foo) for every quantile (so my final plot will have 5 data points). Is this possible?
Upvotes: 1
Views: 2047
Reputation: 263481
foo <- rnorm(100)
bar <- rnorm(100)
mn.foo.byQ5bar <- tapply(foo, cut(bar, quantile(bar, (0:5)/5, na.rm=TRUE)), mean)
> mn.foo.byQ5bar
 (-3.31,-0.972] (-0.972,-0.343]  (-0.343,0.317]   (0.317,0.792]    (0.792,2.71]
     0.13977839      0.03281258     -0.18243804     -0.14242885     -0.01696712
plot(mn.foo.byQ5bar)
This is a fairly standard task, and the cut2 function in Harrell's Hmisc package has a nice g= argument that lets you do this by just specifying an integer for the number of groups. I also like it because the intervals from the cut operation are left-closed rather than right-closed, which is the R default.
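A rough sketch of the cut2 approach described above, assuming Hmisc may or may not be installed: it calls cut2 with g = 5 when the package is available and otherwise falls back to base cut() with right = FALSE, which also gives left-closed intervals. The seed and the helper name quintile_bins are illustrative assumptions, not part of the original answer.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
foo <- rnorm(100)
bar <- rnorm(100)

# Hypothetical helper: split x into quantile groups with left-closed bins.
quintile_bins <- function(x, groups = 5) {
  if (requireNamespace("Hmisc", quietly = TRUE)) {
    Hmisc::cut2(x, g = groups)   # g = number of quantile groups
  } else {
    breaks <- quantile(x, probs = seq(0, 1, length.out = groups + 1),
                       na.rm = TRUE)
    # right = FALSE gives [a, b) bins; include.lowest = TRUE closes the last bin
    cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)
  }
}

bins <- quintile_bins(bar)
mn.foo.byQ5bar <- tapply(foo, bins, mean)   # mean of foo within each bar group
print(table(bins))                          # group sizes should be roughly equal
print(mn.foo.byQ5bar)
```

Either branch assigns every observation to a group, including the minimum, which plain cut() with default arguments would turn into NA.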
Upvotes: 6
Reputation: 55420
You can combine a lot of these lines into more concise code, but here it is broken down:
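# Sample data
x <- 1:100; y <- rnorm(x)
# Number of groups
N <- 5
# Quantile break points: N groups need N + 1 breaks
Q.y <- quantile(y, probs=seq(0, 1, length=(N+1)))
# N evenly spaced quantiles of x, used as plotting positions
Q.x <- quantile(x, probs=seq(0, 1, length=N))
# Means of y by quantile group (include.lowest keeps the minimum value)
means.y <- c(by(y, cut(y, Q.y, include.lowest=TRUE), mean))
# Plot them (qplot comes from ggplot2)
library(ggplot2)
qplot(Q.x, means.y)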
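If the goal is the mean of y within quantile groups of x (as in the foo ~ bar question), one possible variant bins on x instead and plots each group mean against the median x of that group. This is only a sketch under that assumption, using base graphics rather than qplot; the names breaks, bins, mean.y and mid.x are illustrative.

```r
set.seed(42)                     # arbitrary seed, only for reproducibility
x <- 1:100
y <- rnorm(length(x))
N <- 5                           # number of quantile groups

# N groups need N + 1 break points; include.lowest keeps the minimum of x
breaks <- quantile(x, probs = seq(0, 1, length.out = N + 1))
bins   <- cut(x, breaks = breaks, include.lowest = TRUE)

mean.y <- tapply(y, bins, mean)      # mean of y within each x group
mid.x  <- tapply(x, bins, median)    # representative x position per group

plot(mid.x, mean.y, xlab = "median of x in group", ylab = "mean of y")
```

Plotting against the group median keeps the number of x positions equal to the number of means, so no separate set of plotting quantiles is needed.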
Upvotes: 5