Reputation: 323
I'm trying to use ggplot2 to create a plot in which the area under the line is filled only at the points where my coefficients are significant.
I have created this example:
library(data.table)
library(ggplot2)
dt <- data.table(x = 0:23, y = c(0.00788665622373638, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0.031263597681424, 0.0483478996438207,
0.0339161353262161, 0, 0, 0, 0, 0, 0, 0, 0), value = c(0.335524374372203,
0.310445022036626, 0.00348268861151579, 0.000645923627809575,
0.0025476114971974, 0.000979901982654185, 0.00447235816030944,
0.000375791689380511, 0.00850170357523439, 0.185246478252772,
0.236061996429638, 0.611479957550591, 0.916055517054685, 0.047195113633542,
0.00170024647583689, 0.0138696238231373, 0.700687775315984, 0.0562079029293676,
0.00527934454203627, 0.00870851100765857, 0.005848832805464,
0.00300379176492194, 0.00400049813928849, 0.323674152828656))
And using the following code:
plt <- ggplot(dt,aes(x=x,y=y)) + geom_line(colour='blue') + geom_point() + geom_area(data=subset(dt,value<0.1 & y > 0),fill='skyblue',alpha=0.3)
I get this graph:
It seems that it is connecting the points where value is under 0.1, whereas I only want to color the area under the line where value
is under 0.1.
Is there any way around this?
Upvotes: 0
Views: 1526
Reputation: 19716
While writing a function to transform the data so that it could be plotted as requested, I found a potential problem with the idea.
Consider a point x where y is positive and value is < 0.1, while x-1 and x+1 have values > 0.1. With geom_area this point would be left out, because a single point spans zero width and the area of a line is 0. Hence I believe several other visualizations could be more useful:
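To make the isolated-point problem concrete, here is a minimal sketch in base R. The `toy` data frame and the helper names `sig`, `left`, `right` and `isolated` are made up for illustration and are not part of the question's data; the 0.1 threshold is the one used in the question.

```r
# Hypothetical toy data: only x = 3 is significant (value < 0.1) with y > 0
toy <- data.frame(x = 1:5,
                  y = c(0.2, 0.1, 0.3, 0.2, 0.1),
                  value = c(0.5, 0.4, 0.01, 0.6, 0.3))

# Points that should be highlighted
sig <- toy$value < 0.1 & toy$y > 0

# Is the neighbour on the left / right also significant?
n <- nrow(toy)
left <- c(FALSE, sig[-n])
right <- c(sig[-1], FALSE)

# Significant points with no significant neighbour: an area fill
# has zero width here, so nothing visible would be drawn
isolated <- sig & !left & !right
toy$x[isolated]
```

Any x reported here would be invisible in a pure area fill, which is the motivation for the line-range and point-based alternatives.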
geom_linerange or geom_pointrange are potentially better (and much easier to plot). Here is an example with your data; it emphasizes the points where value < 0.1 and y > 0:
ggplot(dt,aes(x=x,y=y)) +
geom_line(colour='blue') +
geom_point() +
geom_linerange(data = dt[dt$value < 0.1,], aes(ymin = 0, ymax = y), color= "skyblue", size = 1)
Alternatively, use geom_point to emphasize the points where value < 0.1:
ggplot(dt,aes(x=x,y=y)) +
geom_line(colour='blue') +
geom_point() +
geom_point(data = dt[dt$value < 0.1,], color= "red", size = 2)
If you are really set on using geom_area, here is a function (base R only):
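for_area = function(data, val){
  # Work on a plain data.frame so the base R calls below also behave for data.table input
  df = as.data.frame(data)
  # Zero out y and value wherever the row is not significant (value >= val)
  drop = df$value >= val
  df$value = ifelse(drop, 0, df$value)
  df$y = ifelse(drop, 0, df$y)
  n = nrow(df)
  nonzero = df$y != 0
  # pre: rows where a non-zero run starts (the previous y is 0)
  pre = which(nonzero & c(FALSE, !nonzero[-n]))
  # pro: rows where a non-zero run ends (the next y is 0); the last row is
  # never compared with the non-existent row n + 1, which would yield NA
  pro = which(nonzero & c(!nonzero[-1], FALSE))
  pre = df$x[pre]
  pro = df$x[pro]
  # x1 orders rows that share an x: 0 = zero added before a run,
  # 1 = original row, 2 = zero added after a run
  df$x1 = 1
  df = rbind(df, data.frame(x = pre,
                            y = rep(0, length(pre)),
                            value = rep(0, length(pre)),
                            x1 = rep(0, length(pre))))
  df = rbind(df, data.frame(x = pro,
                            y = rep(0, length(pro)),
                            value = rep(0, length(pro)),
                            x1 = rep(2, length(pro))))
  # Sort so that each run is enclosed by its added zero rows
  df = df[with(df, order(x, x1)),]
  return(df)
}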
With the data from the question:
ggplot(dt,aes(x=x,y=y)) +
geom_line(colour='blue') +
geom_point() +
geom_area(data = for_area(dt, 0.1), fill= "skyblue", alpha = 0.3)
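As a possible alternative to padding the data with zero rows, one could label each contiguous run of significant points and draw one ribbon per run. The following is only a sketch under assumptions: the `toy` data frame and the names `sig`, `run_id` and `shaded` are invented for illustration, and the plotting part is guarded in case ggplot2 is not installed. Unlike for_area, this version does not add vertical edges at the ends of a run, and a run consisting of a single point still draws nothing.

```r
# Hypothetical toy data with two separate significant runs (x = 2:3 and x = 6:7)
toy <- data.frame(x = 1:8,
                  y = c(0.1, 0.3, 0.2, 0.1, 0.2, 0.4, 0.3, 0.1),
                  value = c(0.5, 0.01, 0.02, 0.4, 0.3, 0.05, 0.03, 0.6))

# Rows to shade: significant and above zero
sig <- toy$value < 0.1 & toy$y > 0

# A new run starts whenever the significance flag changes
run_id <- cumsum(c(TRUE, diff(sig) != 0))

# Keep only the significant rows, remembering which run each belongs to
shaded <- toy[sig, ]
shaded$run <- run_id[sig]

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # group = run stops the ribbon from bridging the gap between runs
  p <- ggplot(toy, aes(x = x, y = y)) +
    geom_line(colour = "blue") +
    geom_point() +
    geom_ribbon(data = shaded, aes(ymin = 0, ymax = y, group = run),
                fill = "skyblue", alpha = 0.3)
  # print(p) to draw the plot
}
```

The grouping is what prevents the fill from connecting separate significant stretches, which is the effect seen in the question when the subset is passed to geom_area directly.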
With a more complicated example:
# daf, reconstructed from the output of dput(daf)
daf <- structure(list(x = 1:25, y = c(0.3, 0.2, 0.2, 0, 0.1, 0.1, 0.3,
0.2, 0.3, 0.1, 0, 0.3, 0.2, 0.1, 0.3, 0, 0.2, 0.3, 0, 0.1, 0.1,
0.2, 0.3, 0, 0.3), value = c(0, 0.3, 0, 0, 0, 0.2, 0.3, 0.2,
0.2, 0.3, 0.2, 0.2, 0, 0, 0.2, 0, 0.2, 0, 0.1, 0.1, 0.1, 0, 0.3,
0.2, 0.3)), .Names = c("x", "y", "value"), row.names = c(NA,
-25L), class = "data.frame")
This illustrates some of the problems I mentioned earlier: value at x = 3 is 0.0 while y = 0.2, but there is no indication of that in the plot, since its neighbours x = 2 and x = 4 have either value > 0.1 or y == 0.
With geom_pointrange this would become:
Perhaps the best of both worlds:
ggplot(daf,aes(x=x,y=y)) +
geom_line(colour='blue') +
geom_point() +
geom_area(data = for_area(daf, 0.1), fill= "skyblue", alpha = 0.3 )+
geom_linerange(data = daf[daf$value<0.1,], aes(ymin = 0, ymax = y), color= "skyblue", size = 1)
Upvotes: 1