Reputation: 43
I'm trying to identify the densest region in a plot, and I do this using stat_ellipse() in ggplot2. But I cannot get any information about the points inside the ellipse (the total count, the order number of each point, and so on).
I have seldom seen this problem discussed. Is this possible?
For example:
ggplot(faithful, aes(waiting, eruptions)) +
  geom_point() +
  stat_ellipse()
Upvotes: 4
Views: 2480
Reputation: 35242
Here is Roman's suggestion implemented. The help for stat_ellipse says it uses a modified version of car::ellipse, so I chose to extract the ellipse points from the ggplot object. That way it should always be correct (even if you change options in stat_ellipse).
# Load packages
library(ggplot2)
library(sp)
# Build the plot first
p <- ggplot(faithful, aes(waiting, eruptions)) +
  geom_point() +
  stat_ellipse()
# Extract components
build <- ggplot_build(p)$data
points <- build[[1]]
ell <- build[[2]]
# Find which points are inside the ellipse, and add this to the data
dat <- data.frame(
  points[1:2],
  in.ell = as.logical(point.in.polygon(points$x, points$y, ell$x, ell$y))
)
# Plot the result
ggplot(dat, aes(x, y)) +
  geom_point(aes(col = in.ell)) +
  stat_ellipse()
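The question also asks for the total count and the order number of each point inside the ellipse. A minimal sketch of how that could be read off the same test follows. It uses layer_data() rather than ggplot_build(p)$data, which should give the same per-layer data. The names n_inside, inside_idx and inside_pts are illustrative only, and it assumes that the point layer keeps the row order of faithful.

```r
library(ggplot2)
library(sp)

p <- ggplot(faithful, aes(waiting, eruptions)) +
  geom_point() +
  stat_ellipse()

# layer_data() returns the computed data for a single layer of the plot
pts <- layer_data(p, 1)  # geom_point layer
ell <- layer_data(p, 2)  # stat_ellipse layer (points along the ellipse outline)

# point.in.polygon() returns 0 for points strictly outside the polygon,
# and a non-zero code for points inside or on the boundary
inside <- point.in.polygon(pts$x, pts$y, ell$x, ell$y) != 0

n_inside   <- sum(inside)         # number of points inside the ellipse
inside_idx <- which(inside)       # row numbers (assumes layer order matches faithful)
inside_pts <- faithful[inside, ]  # the original observations inside the ellipse
```

Note that the ellipse is tested as a polygon approximation, so points very close to the boundary may be classified differently than by the exact ellipse.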
Upvotes: 5