How to separate a set of geographic regions into multiple groups that are similar to each other while maintaining geographic contiguity?

Question

I have recently attempted a regionalization analysis on a set of geographic regions, each of which has multiple attributes (A1, A2, A3, ...). The goal differs from a regular regionalization problem (such as K-means), in which you define groups with minimal within-group dissimilarity but maximal between-group dissimilarity.

My regionalization is the opposite: I want the groups to be as similar to one another as possible in terms of means, variances, and other statistics (the regions within a group do not have to be as dissimilar as possible, but that is of less concern). I came across the minDiff package and its successor, the anticlust package, in R, and they do the job wonderfully except for one problem: since this is a regionalization problem, I really want the final groups to be geographically connected. Results from minDiff/anticlust, however, show the different groups mixed with one another all over the map. Here is some sample code:

A data frame containing the geographic units and their attributes is read from a shapefile and stored in geo.df.

library(sf)
library(anticlust)

geo.df <- as.data.frame(read_sf(dsn = getwd(), layer = "geolayer", stringsAsFactors = FALSE))

geo.df$class <- anticlustering(geo.df[, c("A1", "A2", "A3", "A4", ..., "An")], K = 5, objective = "variance", standardize = TRUE)

I've tried including coordinates in the list of attributes (A1, A2, ..., An), and also pairwise distances, but neither worked. I always end up with well-separated groups that are all mixed with one another in geographic space.
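For reference, the coordinate attempt can be sketched roughly as follows. This is a minimal, self-contained illustration and not the exact code used: the 10 x 10 grid, the random attributes A1 to A3, and the names geo.sf, xy, and features are all made up so that the example runs without the shapefile, and centroid coordinates are assumed as the spatial features.

```r
library(sf)
library(anticlust)

set.seed(1)

# Toy stand-in for the shapefile: a 10 x 10 grid of square regions (hypothetical data)
bbox <- st_bbox(c(xmin = 0, ymin = 0, xmax = 10, ymax = 10))
grid <- st_make_grid(st_as_sfc(bbox), n = c(10, 10))
geo.sf <- st_sf(A1 = rnorm(100), A2 = rnorm(100), A3 = rnorm(100), geometry = grid)

# Centroid coordinates (columns X and Y) appended to the attribute table
xy <- st_coordinates(st_centroid(st_geometry(geo.sf)))
features <- cbind(st_drop_geometry(geo.sf)[, c("A1", "A2", "A3")], xy)

# Same call as above, now with the coordinates treated as two more attributes
geo.sf$class <- factor(
  anticlustering(features, K = 5, objective = "variance", standardize = TRUE)
)

# Map of the resulting groups
plot(geo.sf["class"])
```

One likely reason this does not help: the coordinates are balanced across groups like any other attribute, so every group ends up with roughly the same mean location, which matches the scattered pattern on the map.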

Any pointers on how to proceed from here? Any hints will be greatly appreciated.

Thank you all in advance.
