Reputation: 813
The goal is to generate a "histogram" of x where the bars are sum(y)/count(x), where y is another variable describing the data. The point is to use ggplot binning to do the grouping part. I do not want to calculate the binning myself and then perform the calculation.
Example:
library(ggplot2)
library(data.table)
k <- runif(1000)
k <- k[order(k)]
y <- c(rbinom(n = 500, size = 1, prob = .05), rbinom(n = 500, size = 1, prob = .95))
w <- data.table(k, y)
So plot(w$k, w$y) gives:
[figure: plot of w$y against w$k]
In theory, what I am looking for would be written like this:
ggplot(w, aes(k)) + geom_histogram(aes(y = stat(sum(y)/count)))
But it generates this:
[figure: the resulting histogram]
Upvotes: 0
Views: 216
Reputation: 6483
Not sure if this is what you want, but sum(y) is going to be the same for all bars, because it is evaluated over all of y rather than per bin.
library(ggplot2)
library(data.table)
set.seed(13434)
k <- runif(1000)
k <- k[order(k)]
y <- c(rbinom(n = 500, size = 1, prob = .05), rbinom(n = 500, size = 1, prob = .95))
w <- data.table(k, y)
constant_value <- sum(w$y)
ggplot(w, aes(k)) + geom_histogram(aes(y = stat(constant_value/count)))
gives exactly the same plot as
ggplot(w, aes(k)) + geom_histogram(aes(y = stat(sum(w$y)/count)))
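One way to get a per-bin value while still letting ggplot2 do the binning may be stat_summary_bin(), which bins x and applies a summary function to the y values that fall in each bin. The sketch below is an illustration rather than part of the answer above. It assumes ggplot2 3.3.0 or later (for the fun argument); the seed and the choice of 30 bins are arbitrary. Because mean(y) within a bin equals sum(y)/count for that bin, this should produce the bars the question describes.

```r
library(ggplot2)

set.seed(13434)  # arbitrary seed, only for reproducibility
k <- sort(runif(1000))
y <- c(rbinom(n = 500, size = 1, prob = .05),
       rbinom(n = 500, size = 1, prob = .95))
w <- data.frame(k, y)

# stat_summary_bin() bins k, then applies `fun` to the y values in each bin.
# mean(y) per bin is the same quantity as sum(y) / count per bin.
p <- ggplot(w, aes(k, y)) +
  stat_summary_bin(fun = mean, geom = "bar", bins = 30) +
  labs(x = "k", y = "sum(y) / count per bin")

print(p)
```

Passing binwidth or breaks instead of bins should control the binning in the same way as for geom_histogram.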
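Not sure if this helps you: here I sort by k and cut the rows into groups of roughly 30 consecutive observations (note that ggplot2's default is 30 bins, not a binwidth of 30, so this binning differs from geom_histogram's):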
library(tidyverse)
w %>%
  arrange(k) %>%
  mutate(bin = cut_interval(1:length(k), length = 30, labels = FALSE)) %>%
  group_by(bin) %>%
  summarise(mean_y = mean(y),
            mean_k = mean(k),
            width = max(k) - min(k)) %>%
  ggplot(aes(mean_k, mean_y, width = width)) +
  geom_bar(stat = "identity") +
  labs(x = "k", y = "mean y")
which makes this figure:
[figure: bar chart of mean y against k]
Upvotes: 1
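For comparison, the dplyr pipeline in the answer cuts on the row index, so each bin holds a run of consecutive observations rather than spanning an equal range of k. A minimal sketch of an equal-width variant follows, assuming 30 bins over the range of k. The names n_bins, breaks, mids and binned are illustrative and do not come from the thread.

```r
library(ggplot2)

set.seed(13434)  # arbitrary seed, only for reproducibility
k <- sort(runif(1000))
y <- c(rbinom(500, 1, .05), rbinom(500, 1, .95))

# Equal-width bin edges over the observed range of k (30 bins is an assumption)
n_bins <- 30
breaks <- seq(min(k), max(k), length.out = n_bins + 1)
mids <- (breaks[-1] + breaks[-(n_bins + 1)]) / 2

# Integer bin index for every observation
bin <- cut(k, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Per-bin sum(y) and count; empty bins simply do not appear
sum_y <- tapply(y, bin, sum)
count <- tapply(y, bin, length)
idx <- as.integer(names(count))

binned <- data.frame(
  mid = mids[idx],
  ratio = as.numeric(sum_y / count),
  count = as.integer(count)
)

p <- ggplot(binned, aes(mid, ratio)) +
  geom_col(width = diff(breaks)[1]) +
  labs(x = "k", y = "sum(y) / count")

print(p)
```

This does the binning by hand, which the question wanted to avoid, but it makes explicit what each bar represents.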