SamPassmore

Reputation: 1365

Highlight particular hex bins with geom_hex

I am trying to build a ggplot hexbin plot where I can highlight hexbins where particular data occur by changing the border colour of those bins.

Here is some code to illustrate the problem, where x and y are the coordinates and group marks a subgroup of data points. I want to highlight every hexagon that contains at least one data point with group = 1.

library(ggplot2)

n = 1000

df = data.frame(x = rnorm(n),
                y = rnorm(n),
                group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE))

ggplot(df, aes(x = x, y = y)) +
  geom_hex()

Upvotes: 0

Views: 189

Answers (2)

stefan

Reputation: 124403

I'm not 100% sure how your final plot should look, but one option to highlight hexagons containing an observation with group = 1 is to use stat_summary_hex:

library(ggplot2)

set.seed(123)
# df is built as in the question, after setting the seed

ggplot(df, aes(x = x, y = y)) +
  stat_summary_hex(aes(z = group), fun = ~ any(.x == 1))
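To check which bins will be flagged before styling them, the values computed by stat_summary_hex can be pulled out of the plot with layer_data(). This is a minimal sketch, assuming the hexbin package is installed and df is rebuilt as in the question; the names p, hex_values and n_highlighted are only illustrative, and a plain function is used in place of the formula shorthand.

```r
library(ggplot2)

set.seed(123)
n <- 1000
df <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE)
)

# Same layer as above, with the summary written as an ordinary function
p <- ggplot(df, aes(x = x, y = y)) +
  stat_summary_hex(aes(z = group), fun = function(z) any(z == 1))

# One row per non-empty hexagon; `value` holds the result of `fun` for that bin
hex_values <- layer_data(p)

# Number of hexagons that contain at least one observation with group = 1
n_highlighted <- sum(hex_values$value)
n_highlighted
```

Each row of hex_values is one hexagon, and value is TRUE where at least one observation with group = 1 fell into it.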


EDIT: It is of course also possible to use this approach to set the outline color. To do so, use after_stat to map the value computed by stat_summary_hex to the color aesthetic, and set the desired outline color via scale_color_manual:

ggplot(df, aes(x = x, y = y)) +
  geom_hex() +
  stat_summary_hex(aes(
    z = group,
    color = after_stat(as.character(value))
  ), fun = ~ +any(.x == 1), fill = NA) +
  scale_color_manual(
    values = c("0" = "transparent", "1" = "yellow"),
    guide = "none"
  )
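The unary + in front of any() matters for the manual scale: it turns the logical result into 0/1, so that as.character() yields the keys "0" and "1" used in values. A small base R sketch of that conversion, with made-up input vectors and an illustrative variable name (flags):

```r
# Two hypothetical bins: the first contains a group = 1 point, the second does not
flags <- c(any(c(0, 1, 0) == 1), any(c(0, 0) == 1))

# With the unary plus, the logicals become integers first
keys_with_plus <- as.character(+flags)
keys_with_plus
#> [1] "1" "0"

# Without it, the keys would be "TRUE"/"FALSE" and would not match the scale
keys_without_plus <- as.character(flags)
keys_without_plus
#> [1] "TRUE"  "FALSE"
```

Dropping the + would work just as well if the scale were keyed on "TRUE" and "FALSE" instead.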


Upvotes: 3

margusl

Reputation: 17434

A slightly different take: handle the hex grid and the grouping outside of ggplot, with the sf package:

library(ggplot2)
library(dplyr)
library(sf)

set.seed(1)
n = 1000
df = data.frame(x = rnorm(n), 
                y = rnorm(n),
                group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE))


points_sf <- st_as_sf(df, coords = c("x","y"))

# generate 30x30 hex grid to match default bin size of stat_bin_hex(),
# add cell_id
hex_grid <- st_make_grid(points_sf, n = c(30,30), square = FALSE) %>% 
  st_as_sf() %>% 
  mutate(cell_id  = row_number())

# spatial join to match points with cell_id, 
# summarise without geometries (would generate multipoints),
# join hex geometries by cell_id
hex_cells <- points_sf %>% 
  st_join(hex_grid) %>% 
  st_drop_geometry() %>% 
  summarise(count = n(), includes_1 = any(group == 1), .by = cell_id) %>% 
  right_join(hex_grid, .)
#> Joining with `by = join_by(cell_id)`

hex_cells
#> Simple feature collection with 373 features and 3 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -3.121687 ymin: -3.384439 xmax: 3.923915 ymax: 3.766982
#> CRS:           NA
#> First 10 features:
#>    cell_id count includes_1                              x
#> 1       24     1      FALSE POLYGON ((-3.008049 -1.4161...
#> 2       43     1      FALSE POLYGON ((-2.89441 -0.82567...
#> 3       46     1      FALSE POLYGON ((-2.89441 0.355295...
#> 4       65     1      FALSE POLYGON ((-2.780771 0.55212...
#> 5       94     1       TRUE POLYGON ((-2.553494 -2.2034...
#> 6       97     1      FALSE POLYGON ((-2.553494 -1.0225...
#> 7      102     1      FALSE POLYGON ((-2.553494 0.94577...
#> 8      112     2      FALSE POLYGON ((-2.439855 -2.0066...
#> 9      115     1      FALSE POLYGON ((-2.439855 -0.8256...
#> 10     116     1      FALSE POLYGON ((-2.439855 -0.4320...

# add the highlight as a separate layer with a constant color;
# controlling border color through aesthetics can introduce some artefacts when
# neighbouring cells are from different groups
ggplot(hex_cells) +
  geom_sf(aes(fill = count)) +
  geom_sf(data = ~ filter(.x, includes_1), color = "gold", fill = NA, linewidth = .5)

Created on 2023-07-14 with reprex v2.0.2

Upvotes: 3
