Arturo
Arturo

Reputation: 354

Plotting on a geographical map the provenience of our patients

I am trying to put on a Italian geographical map a dot reporting the provenience ('provincia') of our patients. Ideally, the dot size should be proportional to the number of patients coming from that 'provincia'. An example of the list I would like to plot is the following.

MI  8319
CO  537
MB  436
VA  338
BG  310
PV  254
CR  244
NO  210
RM  189
CS  179

In the first column there is the 'provincia' code: MI (Milano), CO (Como), MB (Monza-Brianza), etc. In the second column there is the number of patients from that 'provincia'. So the output should be an Italian political map where the biggest dot is around the city of Milano (MI), the second biggest dot is near the city of Como (CO), the third one is around the city of Monza-Brianza (MB),etc. Is there any package that could do the plot I am looking for? I found a tool that could do the job here, but apparently they expect that I load the geographical coordinates in order to do the plot.

https://www.littlemissdata.com/blog/maps

Thanks in advance.

Upvotes: 1

Views: 498

Answers (2)

dieghernan
dieghernan

Reputation: 3412

UPDATE ON INSET MAP

This is an update following different suggestion on using inset maps, which I think it would be the best solution for yout question and comments:

library(sf)
library(cartography)


EU = st_read("~/R/mapslib/EUROSTAT/NUTS_RG_03M_2016_3035_LEVL_3.geojson")
IT = subset(EU, CNTR_CODE == "IT")
mydata <-
  structure(list(
    abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR",
             "NO", "RM", "CS"),
    value = c(8319L, 537L, 436L, 338L, 310L, 254L,
              244L, 210L, 189L, 179L),
    nuts = c("ITC4C","ITC42","ITC4D","ITC41",
             "ITC46", "ITC48","ITC4A","ITC15",
             "ITI43","ITF61")
  ),
  class = "data.frame",
  row.names = c(NA, -10L))

patients = merge(IT, mydata, by.x = "id", by.y = "nuts")

#Get breaks for map
br=getBreaks(patients$value)
#Delimit zone
#Based on NUTS1, Nortwest Italy
par(mar=c(0,0,0,0))
ghostLayer(IT[grep("ITC",IT$NUTS_ID),], bg="lightblue")
plot(st_geometry(EU), col="grey90", add=TRUE)
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464", add=TRUE)
choroLayer(
  patients,
  var = "value",
  breaks = br,
  col = carto.pal(pal1 = "red.pal", n1 = length(br)-1),
  legend.pos = "topleft",
  legend.title.txt = "Total patients",
  add = TRUE,
  legend.frame = TRUE
)
labelLayer(patients,txt="abbs", halo=TRUE, overlap = FALSE)

#Inset
par(
  fig = c(0, 0.4, 0.01, 0.4),
  new = TRUE
)
inset=patients[patients$abbs %in% c("RM","CS"),]
ghostLayer(inset, bg="lightblue")
plot(st_geometry(EU), col="grey90", add=TRUE)
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464", add=TRUE)
choroLayer(
  patients,
  var = "value",
  breaks = br,
  col = carto.pal(pal1 = "red.pal", n1 = length(br)-1),
  legend.pos = "n",
  add = TRUE
)
labelLayer(patients,txt="abbs", halo=TRUE, overlap = FALSE)
box(which = "figure", lwd = 1)

enter image description here

#RESTORE PLOT

par(fig=c(0,1,0,1))

OLD ANSWER

Following my comment on plotting labels, maybe with circles is not the best option for your map, given the concentration. I suggest you to use another kind of map for that, as chorolayer, I leveraged on https://stackoverflow.com/users/3304471/jazzurro for the dataframe.

    library(sf)
    library(cartography)

    EU = st_read("~/R/mapslib/EUROSTAT/NUTS_RG_03M_2016_3035_LEVL_3.geojson")
    IT = subset(EU, CNTR_CODE == "IT")
    mydata <-
      structure(list(
        abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR",
                 "NO", "RM", "CS"),
        value = c(8319L, 537L, 436L, 338L, 310L, 254L,
                  244L, 210L, 189L, 179L),
        nuts = c("ITC4C","ITC42","ITC4D","ITC41",
                 "ITC46", "ITC48","ITC4A","ITC15",
                 "ITI43","ITF61")
      ),
      class = "data.frame",
      row.names = c(NA, -10L))

    patients = merge(IT, mydata, by.x = "id", by.y = "nuts")

    #Options1 - With circles
    par(mar = c(0, 0, 0, 0))
    plot(st_geometry(IT), col = "#FEFEE9", border = "#646464")
    propSymbolsLayer(
      x = patients,
      var = "value",
      col = carto.pal(pal1 = "red.pal", n1 = 6),
      legend.title.txt = "Total patients",
      add = TRUE
    )

    #Option 2 - Chorolayer with labels
    par(mar = c(0, 0, 0, 0))
    plot(st_geometry(IT), col = "#FEFEE9", border = "#646464")
    choroLayer(
      patients,
      var = "value",
      col = carto.pal(pal1 = "red.pal", n1 = 6),
      legend.title.txt = "Total patients",
      add = TRUE
    )
    #Create labels
    patients$label = paste(patients$abbs, patients$value, sep = " - ")
    labelLayer(
      patients,
      txt = "label",
      overlap = FALSE,
      halo = TRUE,
      show.lines = TRUE,
      )

Options1 Options2

Data from https://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/nuts-2016-files.html

Upvotes: 0

jazzurro
jazzurro

Reputation: 23574

Here is one way to handle your task. You have the abbreviations for Italian province. You want to use them to merge your data with polygon data. If you download Italy's polygons from GADM, you can obtain data that contain the abbreviations. Specifically, the column, HASC_2 is the one. You need to merge your data with the polygon data. Then, you want to create another data set which contains centroid. You can draw a map with the two data sets.

library(tidyverse)
library(sf)
library(ggthemes)

# Get the sf file from https://gadm.org/download_country_v3.html
# and import it in R.

mysf <- readRDS("gadm36_ITA_2_sf.rds")

# This is your data, which is called mydata.
mydata <- structure(list(abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR", 
"NO", "RM", "CS"), value = c(8319L, 537L, 436L, 338L, 310L, 254L, 
244L, 210L, 189L, 179L)), class = "data.frame", row.names = c(NA, 
-10L))

   abbs value
1    MI  8319
2    CO   537
3    MB   436
4    VA   338
5    BG   310
6    PV   254
7    CR   244
8    NO   210
9    RM   189
10   CS   179

# Abbreviations are in HASC_2 in mysf. Manipulate strings so that
# I can join mydata with mysf with the abbreviations. I also get
# longitude and latitude with st_centroid(). This data set is for
# geom_point().

mysf2 <- mutate(mysf, HASC_2 = sub(x = HASC_2, pattern = "^IT.", replacement = "")) %>% 
         left_join(mydata, by = c("HASC_2" = "abbs")) %>% 
         mutate(lon = map_dbl(geometry, ~st_centroid(.x)[[1]]),
                lat = map_dbl(geometry, ~st_centroid(.x)[[2]]))

# Draw a map

ggplot() +
geom_sf(data = mysf) +
geom_point(data = mysf2, aes(x = lon, y = lat, size = value)) +
theme_map()

enter image description here

Upvotes: 2

Related Questions