CadisEtRama

Reputation: 1111

ggmap visualization of data with circles on map

I am trying to create a map that marks, with circles, the cities my subjects come from. I would like each circle's size to be proportional to the number of people from that city in my data. I would also like a second, smaller circle inside each one showing how many people from that city have the disease.

I have started doing this with ggmap by getting longitudes and latitudes:

library(ggplot2) 
library(maps)
library(ggmap)
geocode("True Blue, Grenada")

I'm stuck because I don't know how to continue. I can't load the US map alone because there is one location in the Caribbean.

Here is a short sample of my data; the actual data set is far too large to post.

subjectid   location            disease
12          Atlanta, GA         yes
15          Boston, MA          no
13          True Blue, Grenada  yes
85          True Blue, Grenada  yes
46          Atlanta, GA         yes
569         Boston, MA          yes
825         True Blue, Grenada  yes
685         Atlanta, GA         no
54          True Blue, Grenada  no
214         Atlanta, GA         no
685         Boston, MA          no
125         True Blue, Grenada  yes
569         Boston, MA          no

Can someone please help?

Upvotes: 2

Views: 2068

Answers (1)

Sandy Muspratt

Reputation: 32789

This should get you started. It does not plot circles within circles. ggplot can be made to map different variables to the same aesthetic (size), but with difficulty. Here, the size of the point represents the total count, and the colour of the point represents the number diseased. You will need to adjust the size scale for your full set of data.

The code below gets the geographic locations of the cities, then merges them back into the data frame. It then summarises the data to give a data frame containing the required counts per city. The map is drawn with boundaries set by the minimum and maximum lon and lat of the cities, padded by a few degrees. The last step is to plot the cities and the counts on the map.

# load libraries
library(ggplot2) 
library(maps)
library(ggmap)
library(grid)
library(plyr)

# Your data
df <- read.table(header = TRUE, text = "
subjectid   location           disease
12          'Atlanta, GA'         yes
15          'Boston, MA'          no
13          'True Blue, Grenada'  yes
85          'True Blue, Grenada'  yes
46          'Atlanta, GA'         yes
569         'Boston, MA'          yes
825         'True Blue, Grenada'  yes
685         'Atlanta, GA'         no
54          'True Blue, Grenada'  no
214         'Atlanta, GA'         no
685         'Boston, MA'          no
125         'True Blue, Grenada'  yes
569         'Boston, MA'          no", stringsAsFactors = FALSE)

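# Get geographic locations and merge them into the data frame
# (recent versions of ggmap need a Google API key set with register_google()
# before geocode() will return results)
geoloc <- geocode(unique(df$location))
pos <- data.frame(location = unique(df$location), geoloc, stringsAsFactors = FALSE)
df <- merge(df, pos, by = "location", all = TRUE)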

# Summarise the data frame: one row per city with the two counts
df <- ddply(df, .(location, lon, lat), summarise, 
   countDisease = sum(disease == "yes"),
   countTotal = length(location))

# Plot the map
mp1 <- fortify(map(fill = TRUE, plot = FALSE))

xmin <- min(df$lon) - 5
xmax <- max(df$lon) + 7
ymin <- min(df$lat) - 5
ymax <- max(df$lat) + 5

Amap <- ggplot() + 
  geom_polygon(aes(x = long, y = lat, group = group), data = mp1, fill = "grey", colour = "grey") + 
  coord_cartesian(xlim = c(xmin, xmax), ylim = c(ymin, ymax)) + 
  theme_bw()

# Plot the cities and counts 
Amap <- Amap + geom_point(data = df, aes(x = lon, y = lat, size = countTotal, colour = countDisease)) +
    geom_text(data = df, aes(x = lon, y = lat, label = gsub(",.*$", "", location)), size = 2.5,  hjust = -.3) +
    scale_size(range = c(3, 10)) +
    scale_colour_continuous(low = "blue", high = "red", space = "Lab")

# Display the map
Amap
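
The code above does not draw circles within circles. One possible way to approximate that effect is to draw two point layers at the same coordinates, the outer one sized by the total count and the inner one by the diseased count, so that both share a single size scale. The following is only a rough sketch under some assumptions: the counts are tallied by hand from the sample data in the question, the lon and lat values are approximate, hand-entered coordinates so the example runs without geocoding, and the names cities and nested are made up for illustration. It is drawn without the map background to keep it short.

```r
library(ggplot2)

# Per-city counts tallied from the sample data in the question.
# lon/lat are approximate, hand-entered values (an assumption), used here
# so that the sketch does not depend on geocode().
cities <- data.frame(
  location     = c("Atlanta", "Boston", "True Blue"),
  lon          = c(-84.39, -71.06, -61.77),
  lat          = c(33.75, 42.36, 12.00),
  countTotal   = c(4, 4, 5),
  countDisease = c(2, 1, 4)
)

nested <- ggplot(cities, aes(x = lon, y = lat)) +
  # Outer circle: everyone from the city
  geom_point(aes(size = countTotal), colour = "grey60", alpha = 0.6) +
  # Inner circle: the diseased subset, drawn on top at the same position
  geom_point(aes(size = countDisease), colour = "red") +
  geom_text(aes(label = location), size = 2.5, hjust = -0.5) +
  # One shared scale; point area is proportional to the count
  scale_size_area(max_size = 15, name = "People") +
  theme_bw()

print(nested)
```

Because both layers map to size, they share one scale, so the inner circle never exceeds the outer one as long as countDisease is no larger than countTotal. The same two geom_point layers could be added to Amap instead of the single geom_point call, with data = df passed to each layer. As with the original plot, max_size would probably need adjusting for the full data.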


Upvotes: 1
