ben_p_4370

Reputation: 53

How to use for loops inside of ggplot2 with maps?

I'm not sure if a for loop is even the answer here (I'm quite new to R), but I'm hoping someone can advise me. Basically, I have a dataframe with three columns: city, latitude, and longitude. Each row represents an incident that took place in the given city, so each city appears more than once, and the number of times a city appears represents the number of incidents that took place in that city:

df1 <- data.frame(city = c("Alexandria", "Cairo", "Luxor", "Luxor", "Alexandria", "Cairo", "Luxor", "Cairo", "Luxor"),
                  latitude = c(31.1977, 30.0435, 25.6833, 25.6833, 31.1977, 30.0435, 25.6833, 30.0435, 25.6833),
                  longitude = c(29.8925, 31.2353, 32.65, 32.65, 29.8925, 31.2353, 32.65, 31.2353, 32.65)
)

(In reality I'm dealing with a csv file with thousands of rows, but the structure is the same). What I want to do is use ggplot and rnaturalearth to create a map plot where a point appears on the location of each city represented in the dataframe (based on the latitude and longitude coordinates corresponding to each city), but the size and color of the dot differ based on the number of times each city appears in the dataframe (i.e. the more appearances, the larger and darker red the dot).

I've gotten as far as the code below, which produces the map with the dots on each city represented, but obviously doesn't alter the size and color of the dot based on the number of appearances of the city. Could anyone help me figure out how to do this? I thought maybe this would involve a for loop that loops through unique(df1$city) and finds length(subset(df1)) for each item in unique(df1$city) and uses that to populate the size and fill arguments, but I'm not sure how to do it. Thanks very much in advance.

install.packages(c("cowplot", "googleway", "ggplot2", "ggrepel", 
                   "ggspatial", "lwgeom", "sf", "rnaturalearth", "rnaturalearthdata", "rgeos"))


library("ggplot2")
library("sf")
library("rnaturalearth")
library("rnaturalearthdata")


df1 <- data.frame(city = c("Alexandria", "Cairo", "Luxor", "Luxor", "Alexandria", "Cairo", "Luxor", "Cairo", "Luxor"),
                  latitude = c(31.1977, 30.0435, 25.6833, 25.6833, 31.1977, 30.0435, 25.6833, 30.0435, 25.6833),
                  longitude = c(29.8925, 31.2353, 32.65, 32.65, 29.8925, 31.2353, 32.65, 31.2353, 32.65)
)

world <- ne_countries(scale = "medium", returnclass = "sf")

ggplot(data = world)+
  geom_sf() +
  geom_point(data = df1, aes(x = longitude, y = latitude), size = 4,
             shape = 25, fill = "darkred")+
  coord_sf(xlim = c(24.6, 37.0), ylim = c(21.9, 32.0), expand = FALSE)

Upvotes: 1

Views: 212

Answers (1)

stefan

Reputation: 124183

There is no need for a for loop. Instead, I would suggest aggregating your data using e.g. dplyr::count, which gives you a dataset with one row per city and a new column n with the number of incidents. This new variable can then be mapped to size and fill:

library("ggplot2")
library("sf")
library("rnaturalearth")
library("rnaturalearthdata")
library(dplyr)

df1 <- data.frame(city = c("Alexandria", "Cairo", "Luxor", "Luxor", "Alexandria", "Cairo", "Luxor", "Cairo", "Luxor"),
                  latitude = c(31.1977, 30.0435, 25.6833, 25.6833, 31.1977, 30.0435, 25.6833, 30.0435, 25.6833),
                  longitude = c(29.8925, 31.2353, 32.65, 32.65, 29.8925, 31.2353, 32.65, 31.2353, 32.65)
)

world <- ne_countries(scale = "medium", returnclass = "sf")

# Aggregate data
df2 <- df1 %>% 
  count(city, latitude, longitude)

ggplot(data = world)+
  geom_sf() +
  geom_point(data = df2, aes(x = longitude, y = latitude, size = n, fill = n),
             shape = 25)+
  coord_sf(xlim = c(24.6, 37.0), ylim = c(21.9, 32.0), expand = FALSE)
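By default a continuous fill is drawn on a blue gradient, so to get the "more incidents, darker red" look from the question you would probably want to set the scales yourself. The following is only a sketch: the colour names and the size range are illustrative choices, not something from the question, and it builds just the points layer so it runs without the map packages. The two scale_* layers should be addable to the map plot above in the same way.

```r
library(ggplot2)
library(dplyr)

df1 <- data.frame(city = c("Alexandria", "Cairo", "Luxor", "Luxor", "Alexandria", "Cairo", "Luxor", "Cairo", "Luxor"),
                  latitude = c(31.1977, 30.0435, 25.6833, 25.6833, 31.1977, 30.0435, 25.6833, 30.0435, 25.6833),
                  longitude = c(29.8925, 31.2353, 32.65, 32.65, 29.8925, 31.2353, 32.65, 31.2353, 32.65)
)

# One row per city; n holds the number of incidents
df2 <- df1 %>% 
  count(city, latitude, longitude)

# Points layer only; "mistyrose", "darkred" and range = c(3, 8) are assumed,
# illustrative values that can be tuned to taste
p <- ggplot(df2, aes(x = longitude, y = latitude)) +
  geom_point(aes(size = n, fill = n), shape = 25) +
  scale_fill_gradient(low = "mistyrose", high = "darkred") +
  scale_size_continuous(range = c(3, 8))

p
```

Printing df2 before plotting is a quick way to check that the aggregation gave one row per city with the expected counts.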

Upvotes: 2
