Ardor Orenda

Reputation: 51

Color only specific countries based on values in a different data frame

I'm a beginner in R and I'm trying to make a world map that colors specific countries by their GDP per capita, which is stored in a separate data frame. Here is my code (found online):

install.packages(c("cowplot", "googleway", "ggplot2", "ggrepel", "ggspatial", "lwgeom", "sf", "rnaturalearth", "rnaturalearthdata", "rgeos"))

library("ggplot2")
theme_set(theme_bw())
library("sf")
library("rnaturalearth")
library("rnaturalearthdata")
library("rgeos")

world <- ne_countries(scale = "medium", returnclass = "sf")

ggplot(data = world) +
  geom_sf() +
  xlab("Longitude") + ylab("Latitude") +
  ggtitle("World map", subtitle = paste0("(", length(unique(world$name)), " countries)"))

This produces a map with 241 countries. However, my GDP data frame only has information on 182 countries, so when I try to use fill = I get an error:

ggplot(data = world) +
  geom_sf(aes(fill = GDP.data$`US$`)) +
  scale_fill_viridis_c(option = "plasma", trans = "sqrt")

Error: Aesthetics must be either length 1 or the same as the data (241): fill

How can I overcome this problem and still make R color those countries which I have in my data frame?

Thank you very much!

Upvotes: 0

Views: 1596

Answers (1)

Ben

Reputation: 30474

Here is a working example that follows @stefan's advice about joining your data to the map data frame.

In this example, I created a small data frame, gdp_data, with GDP information in the column my_gdp for a few selected countries:

gdp_data <- data.frame(
  name = c("Australia", "China", "Brazil"),
  my_gdp = c(1.43, 13.61, 1.86)
)

       name my_gdp
1 Australia   1.43
2     China  13.61
3    Brazil   1.86
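
Because the join below matches on the name column, any country spelled differently in the two data frames would silently end up with NA. A minimal sketch of how one might check for that first, assuming a stand-in vector world_names in place of world$name and a made-up "USA" entry with an illustrative value (neither is from the original data):

```r
# Stand-in for world$name; with the real map you would use world$name directly.
world_names <- c("Australia", "Brazil", "China", "United States of America")

# Hypothetical GDP table where one country is spelled differently
# (the "USA" row and its value are illustrative only).
gdp_data <- data.frame(
  name = c("Australia", "China", "USA"),
  my_gdp = c(1.43, 13.61, 20.5),
  stringsAsFactors = FALSE
)

# Names in the GDP table that have no counterpart in the map data
unmatched <- setdiff(gdp_data$name, world_names)
print(unmatched)

# Rename the mismatched entry so the join can find it, then re-check
gdp_data$name[gdp_data$name == "USA"] <- "United States of America"
remaining <- setdiff(gdp_data$name, world_names)
```

If remaining comes back empty, every row of the GDP table has a matching country in the map data.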

You can merge (or use dplyr::left_join) so that my_gdp is added to your world data frame. Setting all.x = TRUE keeps every country in world for plotting and fills in NA where there is no GDP value.

plot_data <- merge(world, gdp_data, by = "name", all.x = TRUE)
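
For reference, the dplyr::left_join alternative mentioned above could look roughly like this. It is only a sketch: world_names is a plain data frame standing in for the world object so the snippet runs without the map packages. With the real sf object, the same call should work as long as sf is loaded.

```r
library(dplyr)

# Stand-in for `world`: only the `name` column matters for the join.
world_names <- data.frame(
  name = c("Australia", "Brazil", "China", "France"),
  stringsAsFactors = FALSE
)

gdp_data <- data.frame(
  name = c("Australia", "China", "Brazil"),
  my_gdp = c(1.43, 13.61, 1.86),
  stringsAsFactors = FALSE
)

# left_join keeps every row of the left table (like all.x = TRUE in merge)
# and fills my_gdp with NA where gdp_data has no matching name.
plot_data <- left_join(world_names, gdp_data, by = "name")
print(plot_data)
```

One difference from merge is that left_join keeps the row order of the left table, whereas merge sorts the result by the join column by default.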

Then use only this final data frame, plot_data, to create your plot. This is easier to manage than referencing two different data frames in ggplot, and it guarantees one my_gdp value (possibly NA) per country row, which avoids the length mismatch behind the original error (182 values for 241 rows).

ggplot(data = plot_data) +
  geom_sf(aes(fill = my_gdp)) +
  scale_fill_viridis_c(option = "plasma", trans = "sqrt") +
  ggtitle("World map (GDP in trillions $)", subtitle = paste0("(", length(unique(world$name)), " countries)"))
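
Countries without a GDP value are drawn in the scale's NA color, which defaults to a mid grey. If you want them to recede more, one option is to set na.value on the fill scale. The sketch below repeats the steps above end to end; the "grey90" choice is just an assumption about what might look good:

```r
library(ggplot2)
library(sf)
library(rnaturalearth)

world <- ne_countries(scale = "medium", returnclass = "sf")

gdp_data <- data.frame(
  name = c("Australia", "China", "Brazil"),
  my_gdp = c(1.43, 13.61, 1.86),
  stringsAsFactors = FALSE
)

# Keep all countries; those missing from gdp_data get my_gdp = NA
plot_data <- merge(world, gdp_data, by = "name", all.x = TRUE)

# na.value controls the fill of countries whose my_gdp is NA
p <- ggplot(data = plot_data) +
  geom_sf(aes(fill = my_gdp)) +
  scale_fill_viridis_c(option = "plasma", trans = "sqrt", na.value = "grey90") +
  ggtitle("World map (GDP in trillions $)")

print(p)
```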

[Image: gdp plot]

Upvotes: 1
