Reputation: 1365
I am trying to build a ggplot hexbin plot in which I can highlight the hexbins that contain particular data points by changing the border colour of those bins.
Here is some code to illustrate the problem, where x and y are the coordinates and group flags a subgroup of data points. I want to highlight every hexagon that contains at least one data point with group = 1.
library(ggplot2)

n = 1000
df = data.frame(x = rnorm(n),
                y = rnorm(n),
                group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE))
ggplot(df, aes(x = x, y = y)) +
  geom_hex()
Upvotes: 0
Views: 189
Reputation: 124403
I'm not 100% sure how your final plot should look, but one option to highlight hexagons containing an observation with group = 1 is to use stat_summary_hex:
library(ggplot2)
set.seed(123)
ggplot(df, aes(x = x, y = y)) +
  stat_summary_hex(aes(z = group), fun = ~ any(.x == 1))
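To see what the stat_summary_hex call above computes per hexagon, the layer data can be inspected with layer_data(). This is only a sketch: the data are regenerated here so the block runs on its own, and a plain function returning 0/1 is used in place of the formula shorthand (my substitution, not part of the answer).

```r
library(ggplot2)  # stat_summary_hex() also needs the hexbin package installed

set.seed(123)
n <- 1000
df <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE)
)

# 1 if the bin contains at least one point with group == 1, otherwise 0
p <- ggplot(df, aes(x = x, y = y)) +
  stat_summary_hex(aes(z = group), fun = function(z) as.integer(any(z == 1)))

# One row per hexagon: bin centre (x, y) and the summarised value
bins <- layer_data(p)
head(bins[, c("x", "y", "value")])
table(bins$value)
```

The value column is what the fill (or, via after_stat, any other aesthetic) gets mapped from.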
EDIT: It is of course also possible to use this approach to set the outline color. To do so, use after_stat to map the value computed by stat_summary_hex to the color aesthetic, and set the desired outline color via scale_color_manual:
ggplot(df, aes(x = x, y = y)) +
  geom_hex() +
  stat_summary_hex(aes(
    z = group,
    color = after_stat(as.character(value))
  ), fun = ~ +any(.x == 1), fill = NA) +
  scale_color_manual(
    values = c("0" = "transparent", "1" = "yellow"),
    guide = "none"
  )
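A possible variation on the same idea, offered as a sketch rather than as the approach above: let the summary function return NA for bins without a group = 1 point. As far as I understand the documented drop = TRUE default of stat_summary_hex, those bins are then removed, so the outline can be a constant color and no manual color scale is needed. The helper name flag_group_1 is hypothetical, and the data are regenerated so the block is self-contained.

```r
library(ggplot2)  # the hex stats also need the hexbin package installed

set.seed(123)
n <- 1000
df <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE)
)

# Hypothetical helper: 1 if the bin holds a group-1 point, NA otherwise.
# Bins summarised to NA are assumed to be dropped (drop = TRUE is the default).
flag_group_1 <- function(z) if (any(z == 1)) 1 else NA_real_

p <- ggplot(df, aes(x = x, y = y)) +
  geom_hex() +                # counts, as in the question
  stat_summary_hex(
    aes(z = group),
    fun = flag_group_1,
    fill = NA,                # keep the count fill from geom_hex() visible
    colour = "yellow"         # constant outline for the remaining bins
  )
p
```

Because only the flagged hexagons are drawn in the second layer, their outlines are not overdrawn by transparent borders of neighbouring bins.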
Upvotes: 3
Reputation: 17434
A slightly different take: handle the hex grid and the grouping outside of ggplot, with the sf package:
library(ggplot2)
library(dplyr)
library(sf)
set.seed(1)
n = 1000
df = data.frame(x = rnorm(n),
                y = rnorm(n),
                group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE))
points_sf <- st_as_sf(df, coords = c("x","y"))
# generate 30x30 hex grid to match default bin size of stat_bin_hex(),
# add cell_id
hex_grid <- st_make_grid(points_sf, n = c(30,30), square = FALSE) %>%
  st_as_sf() %>%
  mutate(cell_id = row_number())
# spatial join to match points with cell_id,
# summarise without geometries (would generate multipoints),
# join hex geometries by cell_id
hex_cells <- points_sf %>%
  st_join(hex_grid) %>%
  st_drop_geometry() %>%
  summarise(count = n(), includes_1 = any(group == 1), .by = cell_id) %>%
  right_join(hex_grid, .)
#> Joining with `by = join_by(cell_id)`
hex_cells
#> Simple feature collection with 373 features and 3 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: -3.121687 ymin: -3.384439 xmax: 3.923915 ymax: 3.766982
#> CRS: NA
#> First 10 features:
#> cell_id count includes_1 x
#> 1 24 1 FALSE POLYGON ((-3.008049 -1.4161...
#> 2 43 1 FALSE POLYGON ((-2.89441 -0.82567...
#> 3 46 1 FALSE POLYGON ((-2.89441 0.355295...
#> 4 65 1 FALSE POLYGON ((-2.780771 0.55212...
#> 5 94 1 TRUE POLYGON ((-2.553494 -2.2034...
#> 6 97 1 FALSE POLYGON ((-2.553494 -1.0225...
#> 7 102 1 FALSE POLYGON ((-2.553494 0.94577...
#> 8 112 2 FALSE POLYGON ((-2.439855 -2.0066...
#> 9 115 1 FALSE POLYGON ((-2.439855 -0.8256...
#> 10 116 1 FALSE POLYGON ((-2.439855 -0.4320...
# add the highlight as a separate layer with a constant color;
# controlling border color through aesthetics can introduce artefacts when
# neighbouring cells belong to different groups
ggplot(hex_cells) +
  geom_sf(aes(fill = count)) +
  geom_sf(data = ~ filter(.x, includes_1), color = "gold", fill = NA, linewidth = .5)
Created on 2023-07-14 with reprex v2.0.2
Upvotes: 3