Nate Miller
Nate Miller

Reputation: 386

Color portions of sf LINESTRING by variable

I would like color different portions of an sf LINESTRING using a variable with ggplot and geom_sf. I can do something similar using geom_path, but in geom_sf a similar approach does not appear to work. Can someone offer a possible approach?

Sample Data

library(sf)
library(ggplot2)
library(dplyr)
library(tibble)



df <- tibble(time = seq(1,21),
             lon = seq(-50,-30, 1) + rnorm(n = 21),
             lat = seq(10, 20, 0.5) + rnorm(n = 21),
             type = c(rep('A',5),rep('B',10), rep('A',6)))

Using tibble/dataframe:
Using a tibble and a basic ggplot I can do this, incorporating group = 1 and a single line will be drawn with different sections colored by type. This is the type of plot I would like to make, but using an sf object instead.

ggplot() +
  geom_path(data = df, 
            aes(lon, lat, color = type, group = 1))

Using sf object/LINESTRING

If I cast to a LINESTRING using the group_by and type I end up with two LINESTRINGs

df_sf <- st_as_sf(df, coords = c('lon','lat')) %>%
  st_set_crs(.,value = 4326) %>%
  group_by(type) %>%
  summarize(do_union = TRUE) %>%
  st_cast(.,'LINESTRING')

And then when I do variations on the code below I end up with two separate lines and the two portions with type = A are connected.

ggplot() +
  geom_sf(data = df_sf, 
            aes(color = type, group = 1))

Is there a way to achieve the ggplot + geom_path() type behavior, using a geom_sf() approach (so I can project the variables, etc.)?

Upvotes: 3

Views: 743

Answers (2)

Skyler Lewis
Skyler Lewis

Reputation: 33

Another option is to keep using geom_path to plot the lines, but then to use coord_sf to apply a spatial coordinate system to ggplot:

ggplot() +
geom_path(data = df, 
          aes(lon, lat, color = type, group = 1)) + 
coord_sf(crs = "EPSG:4326")

Plot using geom_path and coord_sf

If you want to project it into a different CRS, use st_transform to project the coordinates, then pull the projected coordinates and plot them using geom_path again.

df_projected <- 
  st_as_sf(df, coords = c("lon", "lat")) |>
  st_set_crs("EPSG:4326") |>
  st_transform("EPSG:32624") |>
  mutate(easting = st_coordinates(geometry)[, "X"],
         northing = st_coordinates(geometry)[, "Y"])

ggplot() +
  geom_path(data = df_projected, 
            aes(easting, northing, color = type, group = 1)) + 
  coord_sf(crs = "EPSG:32624")

Plot using geom_path and coord_sf with reprojected coordinates

Data:

df <- tibble(time = seq(1,21),
             lon = seq(-50,-30, 1) + {set.seed(123); rnorm(n = 21)},
             lat = seq(10, 20, 0.5) + {set.seed(456); rnorm(n = 21)},
             type = c(rep('A',5),rep('B',10), rep('A',6)))

Upvotes: 1

Z.Lin
Z.Lin

Reputation: 29095

I found an approach that splits a line into segments here, but that may be overkill for your use case...

Try this:

df_sf <- df %>%
  
  # ensure data is sorted along x-axis
  arrange(lon) %>%
  
  # detect each time type changes, & create a duplicate point with previous type
  mutate(change.type = tidyr::replace_na(lag(type) != type, FALSE)) %>%
  mutate(type = ifelse(change.type,
                       paste(lag(type), type, sep = ";"),
                       type) %>%
           strsplit(";")) %>%
  tidyr::unnest(cols = c(type)) %>%
  
  # create new group column that increments with every colour change
  mutate(change.type = tidyr::replace_na(lag(type) != type, FALSE)) %>%
  mutate(new.type = cumsum(change.type)) %>%
  
  st_as_sf(coords = c('lon', 'lat')) %>%
  st_set_crs(., value = 4326) %>%
  
  # group by both original type (for colour) & new type (for group)
  group_by(type, new.type) %>% 
  summarize(do_union = TRUE) %>%
  st_cast(.,'LINESTRING') %>%
  ungroup()

Compare results

cowplot::plot_grid(
  ggplot() +
    geom_path(data = arrange(df, lon), 
              aes(lon, lat, color = type, group = 1), size = 1) +
    ggtitle("geom_path approach") +
    theme(legend.position = "bottom"),
  
  ggplot() +
    geom_sf(data = df_sf, 
            aes(color = type, group = new.type), size = 1) +
    ggtitle("geom_sf approach") +
    theme(legend.position = "bottom"),

  nrow = 1
)

result

Data:

set.seed(123)
df <- tibble(time = seq(1,21),
             lon = seq(-50,-30, 1) + rnorm(n = 21),
             lat = seq(10, 20, 0.5) + rnorm(n = 21),
             type = c(rep('A', 5), rep('B', 10), rep('A', 6)))

Upvotes: 4

Related Questions