Reputation: 6215
I would like to make a ggplot scatterplot that has a colored background where the color at each point is dictated by the formula color = x * y. On top of this I would plot a bunch of points.
The purpose of the background is to allow the reader to quickly identify which points are "equivalent" because x*y is approximately the same value. I guess this would be accomplished with geom_raster and/or stat_function but I can't quite figure out how to string together the functions. Any insight/tips would be useful and I'll post the final solution.
Here's some skeleton code so you don't have to write an example.
library("ggplot2")
NRPercent <- function(x) {
paste0(sapply(x * 100, scales::comma), "%")
}
data = data.frame( count = c( 5e6,5e6,1e6,1e6, ## lots of experiments
5e6,5e6,5e6, #RS22
5e6,5e6,5e6,5e6,5e6, #RS30
5e6,5e6,5e6,5e6, #RS30
5e6,5e6,5e6,5e6,5e6, #RS30
5e6,5e6,5e6,5e6,5e6, #RS30
5e6,5e6,5e6,5e6, #RS30
1e6,1e6,1e6,1e6,1e6, #RS31
5e5,5e5,5e5,5e5,5e5, #RS31
1e5,1e5,1e5,1e5,1e5, #RS31
5e4,5e4,5e4,5e4,5e4 #RS31
),
percent = c( 1,1,1,1,
0.13,0.475,0.83,
0.1,0.1,0.1,0.1,0.1, #RS30
0.01,0.01,0.01,0.01, #RS30
0.001,0.001,0.001,0.001,0.001, #RS30
0.0001,0.0001,0.0001,0.0001,0.0001, #RS30
0.00001,0.00001,0.00001,0.00001, #RS30
0.01,0.01,0.01,0.01,0.01,
0.01,0.01,0.01,0.01,0.01,
0.01,0.01,0.01,0.01,0.01,
0.01,0.01,0.01,0.01,0.01
),
label = c( "On","On","On","On",
"On","On","On",
"Not On","On","On","On","On",
"Not On","On","On","On",
"Not On","Not On","Not On","Not On","Not On",
"Not On","Not On","Not On","Not On","Not On",
"Not On","Not On","Not On","Not On",
"Unknown","Unknown","Unknown","Unknown","Unknown",
"Unknown","Unknown","Unknown","Unknown","Unknown",
"Unknown","Unknown","Unknown","Unknown","Unknown",
"Unknown","Unknown","Unknown","Unknown","Unknown"
))
g = ggplot(data, aes(x=percent, y=count,color=label)) +
geom_jitter(shape=16,width=0.2, height=0.1) +
scale_y_continuous(trans='log1p',limits=c(40000,10000000),breaks=c(10e6,5e6,1e6,5e5,1e5,5e4,1e4)) +
scale_x_continuous(trans='log',labels = NRPercent, expand=c(0,0), breaks=c(0,0.00001,0.0001,0.001,0.01,0.1,0.5)) +
xlab("Percent")+
ylab("Number") +
theme_bw()
pdf("example_percent_vs_number.pdf")
print(g)
dev.off()
Upvotes: 2
Views: 280
Reputation: 13581
You can try geom_raster like this. I used log10(count*percent) for the fill:
ggplot(data, aes(x=percent, y=count,color=label)) +
# layers are drawn in the order they are added, so the background goes first
geom_raster(aes(fill=log10(count*percent))) +
geom_jitter(shape=16,width=0.2, height=0.1) +
scale_y_continuous(trans='log1p',limits=c(40000,10000000),breaks=c(10e6,5e6,1e6,5e5,1e5,5e4,1e4)) +
scale_x_continuous(trans='log',labels = NRPercent, expand=c(0,0), breaks=c(0,0.00001,0.0001,0.001,0.01,0.1,0.5)) +
xlab("Percent")+
ylab("Number") +
theme_bw()
Or geom_tile:
ggplot(data, aes(x=percent, y=count,color=label)) +
# layers are drawn in the order they are added, so the background goes first
geom_tile(aes(fill=log10(count*percent), x=percent, y=count)) +
geom_jitter(shape=16,width=0.2, height=0.1) +
scale_y_continuous(trans='log1p',limits=c(40000,10000000),breaks=c(10e6,5e6,1e6,5e5,1e5,5e4,1e4)) +
scale_x_continuous(trans='log',labels = NRPercent, expand=c(0,0), breaks=c(0,0.00001,0.0001,0.001,0.01,0.1,0.5)) +
xlab("Percent")+
ylab("Number") +
theme_bw()
You'll need to adjust the width, height, and color scale to your liking (I'd do it, but you're using funny axes). The example below shows that adjusting the tile size is trivial on normal (linear) axes:
ggplot(mtcars, aes(x=cyl,y=mpg)) +
geom_tile(aes(fill=cyl*mpg, x=cyl, y=mpg, width=0.5, height=1)) +
geom_point()
How to fill background
Conceptually, you need to fill in every point on your plot with a value, i.e. build a regular grid over the plotting area and compute x*y for each cell:
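library(dplyr)    # provides %>% and mutate()
library(ggplot2)
# grid of x and y values covering the range of the data
X <- seq(min(mtcars$cyl), max(mtcars$cyl), 0.1)
Y <- seq(min(mtcars$mpg), max(mtcars$mpg), 0.1)
# one row per grid cell, with D = x * y as the background value
SpecDens <- expand.grid(X,Y) %>%
setNames(c("X","Y")) %>%
mutate(D=X*Y)
ggplot(SpecDens, aes(X,Y)) + geom_raster(aes(fill=D))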
Again, this is more difficult with your plot since it spans several orders of magnitude, but the above should get you started.
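For axes that span orders of magnitude, one possible approach is to space the grid evenly in log10 units rather than in linear units. The following is only a sketch: the ranges in x_vals and y_vals are assumptions chosen to roughly cover the question's data, and it uses scale_x_log10()/scale_y_log10() instead of the log and log1p transforms in the question.

```r
library(ggplot2)

# Hypothetical axis ranges (assumed, not taken from the original answer)
x_vals <- 10^seq(-5, 0, length.out = 100)    # percent: 0.00001 to 1
y_vals <- 10^seq(4.5, 7, length.out = 100)   # count: roughly 3e4 to 1e7

# One row per background cell; D is log10(x * y), as in the fill used above
bg <- expand.grid(X = x_vals, Y = y_vals)
bg$D <- log10(bg$X * bg$Y)

# The grid is evenly spaced once both axes are log-transformed
p_bg <- ggplot(bg, aes(X, Y)) +
  geom_raster(aes(fill = D)) +
  scale_x_log10() +
  scale_y_log10() +
  theme_bw()
```

Printing p_bg should show diagonal bands of constant x*y, which is what lets the reader see which points are "equivalent".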
Also, to plot both you'll need to either merge the background-density values with the actual data points into a single data.frame, or give each layer its own data.frame through the layer's data argument.
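As an alternative to merging, each ggplot2 layer accepts its own data argument, so the background grid and the points can stay in separate data frames. A minimal sketch on the mtcars example (the names bg and p are just illustrative):

```r
library(ggplot2)

# Background grid covering the range of the mtcars example
bg <- expand.grid(
  X = seq(min(mtcars$cyl), max(mtcars$cyl), by = 0.1),
  Y = seq(min(mtcars$mpg), max(mtcars$mpg), by = 0.1)
)
bg$D <- bg$X * bg$Y

# The background layer is added first so the points are drawn on top of it
p <- ggplot() +
  geom_raster(data = bg, aes(x = X, y = Y, fill = D)) +
  geom_point(data = mtcars, aes(x = cyl, y = mpg))
```

Keeping the two data frames separate avoids padding the point data with grid rows, at the cost of repeating the aes() mapping in each layer.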
Upvotes: 1