jjoannes

Reputation: 21

R sf & dplyr: summarise fails despite compatible types across groups

With dplyr 0.8.3 and sf 0.8.0, the following worked (see https://stackoverflow.com/a/49354480/9164265):

library(dplyr)
library(sf)

nc <- st_read(system.file("shape/nc.shp", package="sf"))
nc %>%
  group_by(SID74) %>%
  summarise(geometry = st_union(geometry)) %>%
  ungroup()

This combined all geometries sharing the same SID74 value into a single MULTIPOLYGON per group.

However, with dplyr 1.0.0 the same code now gives the following error:

Error: Problem with `summarise()` input `geometry`.
x Input `geometry` must return compatible vectors across groups
ℹ Input `geometry` is `st_union(geometry)`.
ℹ Result type for group 1 (SID74 = 0): <sfc_MULTIPOLYGON>.
ℹ Result type for group 2 (SID74 = 1): <sfc_MULTIPOLYGON>.
Run `rlang::last_error()` to see where the error occurred.

Does anyone know why dplyr throws this error, even though both groups evidently return the same <sfc_MULTIPOLYGON> class? Thanks for any help!

Upvotes: 1

Views: 5397

Answers (3)

charlie_smit_

Reputation: 21

I got a similar error when grouping dates whose values were either numeric or NA_Date_. Using NA instead of NA_Date_ resolved the problem.

It's worth going through your code to try and spot where similar inconsistencies might be creeping in.
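To illustrate the kind of inconsistency described here, the following minimal sketch prints the classes of the different missing-value constants using base R only. It assumes that `NA_Date_` is the lubridate constant, and uses `as.Date(NA)` as a stand-in so that no extra package is needed.

```r
# Each missing-value constant carries its own type.
na_logical <- NA              # plain NA is logical
na_numeric <- NA_real_        # numeric missing value
na_date    <- as.Date(NA)     # Date missing value (stand-in for NA_Date_)

classes <- c(
  logical = class(na_logical),
  numeric = class(na_numeric),
  date    = class(na_date)
)
print(classes)
```

A plain NA is logical and is generally coerced to whatever type the other groups return, which would be consistent with the fix above. A Date-typed NA next to numeric values is a genuine type mismatch.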

Upvotes: 0

Barry DeCicco

Reputation: 331

I don't have a reproducible example for this (I'm under a deadline right now), but I'm getting a similar error when I run some code:

df <- df %>% group_by(case_id) %>% dplyr::mutate(status_official = last(na.omit(status)))

The data has between 1 and 30 rows per case_id. I think the problem is that some cases have no non-missing value for status, while others do. When I filter out the rows with missing status values, I don't get an error.
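If that diagnosis is right, one possible workaround is to make every group return a value of the same type, including groups with nothing but missing values. The sketch below uses a hypothetical helper, `last_non_missing()`, written in base R. It assumes that status is a character column, which is not stated above.

```r
# Hypothetical helper: last non-missing value, or a typed NA when none exists.
# Assumes `status` is character; swap NA_character_ for another typed NA otherwise.
last_non_missing <- function(x) {
  kept <- x[!is.na(x)]
  if (length(kept) == 0) NA_character_ else kept[[length(kept)]]
}

# A group with some non-missing values, and a group with none.
last_non_missing(c("open", NA, "closed", NA))
last_non_missing(c(NA_character_, NA_character_))

# Possible use in the pipeline above (not run here):
# df %>% group_by(case_id) %>% mutate(status_official = last_non_missing(status))
```

Because the helper always returns a length-one character value, every group yields the same type and size. This has not been checked against the data in question.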

Upvotes: 0

jjoannes

Reputation: 21

The error no longer appears after upgrading sf from 0.8.0 to 0.9.5. Although this does not explain the error itself (with dplyr 1.0.0 and sf 0.8.0), it makes sense to upgrade all packages used alongside dplyr whenever dplyr itself is upgraded, especially across a major version as is the case here.
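A quick way to check whether an installed package is older than a version known to work is sketched below. `needs_upgrade()` is a hypothetical helper, not part of any package. The 0.9.5 threshold is simply the sf version reported above, not a documented minimum.

```r
# Hypothetical helper: TRUE when `pkg` is missing or older than `minimum`.
needs_upgrade <- function(pkg, minimum) {
  # system.file() returns "" when the package is not installed.
  installed <- nzchar(system.file(package = pkg))
  if (!installed) return(TRUE)
  packageVersion(pkg) < package_version(minimum)
}

# sf 0.9.5 is the version reported to work with dplyr 1.0.0 above.
if (needs_upgrade("sf", "0.9.5")) {
  message("sf is missing or older than 0.9.5; consider install.packages(\"sf\")")
}
```

The same check can be repeated for any other package that supplies dplyr methods before re-running the failing code.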

Upvotes: 1
