Reputation: 1193
I have two continuous variables (X and Y) that I want to bin into a 2d grid. Every (x, y) pair has an associated factor that is either PASS or FAIL. In each cell of the grid I want to plot the ratio of PASS to FAIL counts.
For example, using the iris dataset:
ggplot(iris, aes(x=Sepal.Length , y=Petal.Length)) + geom_bin2d()
plots the total count in each 2d bin - how do I change this to plot the ratio of the count of virginica and versicolor in each bin?
Upvotes: 0
Views: 174
Reputation: 332
Use stat_summary2d() (named stat_summary_2d() in current ggplot2) together with its z aesthetic. First convert the binary factor into a numeric column in the data frame, then map that column to z so that stat_summary2d() summarises it within each bin.
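library(ggplot2)
# Random 0/1 column standing in for the PASS/FAIL factor
iris$tf <- as.numeric(as.logical(round(runif(nrow(iris)))))
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, z = tf)) +
  # stat_summary_2d() is the current name of stat_summary2d();
  # fun = "mean" is the default, so each bin shows the fraction of 1s
  stat_summary_2d(fun = "mean", bins = 10) +
  labs(title = "Proportion of TRUE by Petal.Length and Sepal.Length") +
  scale_fill_continuous(name = "Proportion")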
Note: converting a two-level factor with as.numeric() returns its level codes, 1 and 2, rather than 0 and 1, so subtract 1 from the result. If the column is a logical, this isn't necessary.
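A small sketch of that coercion, in case it helps. The status vector below is made up for illustration and is not from the question's data:

```r
# Hypothetical PASS/FAIL factor, for illustration only
status <- factor(c("FAIL", "PASS", "PASS", "FAIL"), levels = c("FAIL", "PASS"))

# as.numeric() on a factor returns the integer level codes (1 and 2)
codes <- as.numeric(status)

# Subtracting 1 gives a 0/1 coding (FAIL = 0, PASS = 1 with this level order)
zero_one <- codes - 1

# Comparing against the level name gives 0/1 regardless of level order
pass01 <- as.numeric(status == "PASS")
```

The comparison form is probably the safer one, since it doesn't depend on which level happens to come first.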
Edit: added the default fun='mean' argument to stat_summary2d() to make it clear this is the default behaviour of the function.
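One caveat: the mean of a 0/1 column is the fraction of 1s in each bin, not the count ratio asked for in the question. If you want the actual ratio, you can pass your own function as fun. Below is a rough sketch for the iris example from the question; count_ratio and the is_virginica column are names I made up, and returning NA for bins with no versicolor is just one way to handle division by zero.

```r
library(ggplot2)

# Keep only the two species being compared
two_species <- subset(iris, Species %in% c("virginica", "versicolor"))
two_species$is_virginica <- as.numeric(two_species$Species == "virginica")

# Hypothetical helper: number of 1s divided by number of 0s in a bin.
# Returns NA when the bin has no 0s, because the ratio is undefined there.
count_ratio <- function(z) {
  n_one <- sum(z == 1)
  n_zero <- sum(z == 0)
  if (n_zero == 0) NA_real_ else n_one / n_zero
}

p <- ggplot(two_species,
            aes(x = Sepal.Length, y = Petal.Length, z = is_virginica)) +
  stat_summary_2d(fun = count_ratio, bins = 10) +
  scale_fill_continuous(name = "virginica / versicolor")
p
```

Bins that contain only virginica come out as NA here, so they should be drawn in the fill scale's NA colour rather than on the gradient.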
Upvotes: 1