M. Beausoleil

Reputation: 3555

Calculate the average of a cloud of points based on a grouping variable with sf in R

I have a set of points and want to calculate their average location for each level of a grouping variable:

library(sf)
library(dplyr)

x = st_sfc(st_polygon(list(rbind(c(0,0),c(90,0),c(90,90),c(0,90),c(0,0)))), crs = st_crs(4326))
plot(x, axes = TRUE, graticule = TRUE)
plot(p <- st_sample(x, 7), add = TRUE)
p = st_as_sf(p)
p$test = c("A","A","B","C","C","D","D")

When I use dplyr like this, I get NA:

p %>%
  group_by(test) %>% 
  summarize(geometry = mean(geometry))

I just want the average of each group in the geometry column, not a single original point and not a multipoint.

Upvotes: 1

Views: 1067

Answers (1)

lovalery

Reputation: 4652

I'm not sure I fully understand what you are looking for, but I'll give it a try!

Please find one possible solution below, as a reprex using the sf and dplyr libraries. I guess you were looking for the aggregate() function instead of group_by().

Reprex

  • Code
library(sf)
library(dplyr)

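# Union the points of each group (keeping the first attribute value per group),
# then take the centroid of each group's unioned geometry
R1 <- p %>% 
  aggregate(., 
            by = list(.$test), 
            function(x) x[1]) %>% 
  st_centroid() %>% 
  select(-Group.1)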
#> Warning in st_centroid.sf(.): st_centroid assumes attributes are constant over
#> geometries of x
  • Output 1 (sf object)
R1          
#> Simple feature collection with 4 features and 1 field
#> Attribute-geometry relationship: 0 constant, 1 aggregate, 0 identity
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 2.7875 ymin: 12.91954 xmax: 59.60413 ymax: 51.81421
#> Geodetic CRS:  WGS 84
#>   test                  geometry
#> 1    A POINT (27.17167 12.91954)
#> 2    B   POINT (2.7875 22.54184)
#> 3    C POINT (59.60413 46.90029)
#> 4    D POINT (56.34763 51.81421)
  • Complementary code and Output 2 (i.e. if you just need a dataframe)
R2 <- R1 %>% 
  st_coordinates() %>% 
  cbind(st_drop_geometry(R1),.)

R2
#>   test        X        Y
#> 1    A 27.17167 12.91954
#> 2    B  2.78750 22.54184
#> 3    C 59.60413 46.90029
#> 4    D 56.34763 51.81421
  • Visualization
plot(x)
plot(p, add = TRUE)
plot(R1, pch = 15, add = TRUE)

The points are your data and the small squares are the centroids of each group. (FYI, I set the seed to 427 for reproducibility.)


  • NB: The above uses spherical geometry. If you want planar computations, just add sf_use_s2(FALSE) at the beginning of the script. With sf_use_s2(FALSE), the centroid of each group is located precisely on the straight line connecting its two points. It is up to you to choose according to your needs.
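
If the planar average is what you need, it can also be computed directly from the coordinates, since the planar centroid of a set of points is simply the mean of their X and Y values. The following is only a minimal sketch under that assumption: it rebuilds the question's data (the seed 427 follows the note above, and the name `means` is made up for illustration), averages the coordinates per group, and turns the result back into an sf object. It treats longitude and latitude as plain numbers, so it ignores the curvature of the Earth.

```r
library(sf)
library(dplyr)

# Rebuild the data from the question (seed is an assumption, see note above)
set.seed(427)
x <- st_sfc(st_polygon(list(rbind(c(0, 0), c(90, 0), c(90, 90), c(0, 90), c(0, 0)))),
            crs = st_crs(4326))
p <- st_as_sf(st_sample(x, 7))
p$test <- c("A", "A", "B", "C", "C", "D", "D")

# Extract the coordinates as a plain matrix with columns X and Y
xy <- st_coordinates(p)

# Average X and Y per group, then rebuild point geometries from the means
means <- p %>%
  st_drop_geometry() %>%
  mutate(X = xy[, "X"], Y = xy[, "Y"]) %>%
  group_by(test) %>%
  summarise(X = mean(X), Y = mean(Y)) %>%
  st_as_sf(coords = c("X", "Y"), crs = st_crs(p))

means
```

A group with a single point (B here) simply keeps that point as its average.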

Created on 2022-01-03 by the reprex package (v2.0.1)
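
As a side note on group_by(): a dplyr pipeline can probably get to the same place, because summarise() on a grouped sf object merges the geometries of each group by default (one POINT or MULTIPOINT per group), and st_centroid() can then be applied to that. This is a sketch rather than a tested replacement for the aggregate() approach; `R_alt` is a name chosen for illustration, and the data are rebuilt from the question with the seed mentioned above.

```r
library(sf)
library(dplyr)

# Rebuild the data from the question (seed is an assumption)
set.seed(427)
x <- st_sfc(st_polygon(list(rbind(c(0, 0), c(90, 0), c(90, 90), c(0, 90), c(0, 0)))),
            crs = st_crs(4326))
p <- st_as_sf(st_sample(x, 7))
p$test <- c("A", "A", "B", "C", "C", "D", "D")

R_alt <- p %>%
  group_by(test) %>%
  summarise() %>%    # merges the points of each group into one geometry
  st_centroid()      # one centroid per group

R_alt
```

Like the aggregate() version, this uses spherical geometry unless sf_use_s2(FALSE) is set first, and st_centroid() emits the same warning about attributes being assumed constant.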

Upvotes: 2
