grace0726

Reputation: 41

Creating bubble plot in R

I imported the data below from Excel and want to create a bubble plot from it in R:

# A tibble: 6 x 3
  Country        Series          `2019`
  <chr>          <chr>            <dbl>
1 United Kingdom GDP per capita  42354.
2 United Kingdom Life Expectancy    81
3 United Kingdom Population (M)     67
4 United States  GDP per capita  65298.
5 United States  Life Expectancy    78.8
6 United States  Population (M)    328

I wrote the code below, but it does not plot anything. How can I produce a bubble plot?

bubble2 <- mutate_all(bubble, function(x) as.numeric(as.character(x))) %>% 
  pivot_longer(cols=-c("Country","Series"),names_to="year") %>%
  mutate(Year=as.numeric(year)) %>%
  select(-year) %>%
  
  ggplot(aes(x="GDP per capita", y="Life Expectancy", size="Population (M)", fill=Country)) +
    geom_point(alpha=0.5, shape=21, color="black") +
    scale_size(range = c(10, 1400), name="Population (M)") +
    scale_fill_viridis(discrete=TRUE, guide=FALSE, option="A") +
    ylab("Life Expectancy") +
    xlab("Gdp per capita") 

EDIT: I added 10 more countries and adjusted the code:

bubble2 <- bubble %>%
  pivot_wider(names_from = "Series", values_from = `2019`)

ggplot(bubble2, aes(x = `GDP per capita`, y = `Life Expectancy`, size = `Population (M)`, fill = Country)) +
  geom_point(alpha = 0.5, shape = 21, color = "black") +
  geom_text(aes(label = Country), size = 8 / .pt) +
  #ggrepel::geom_text_repel(aes(label = Country), size = 8 / .pt) +
  scale_size(range = c(.1, 24), name="Population (M)") +
 
  ylab("Life Expectancy") +
  xlab("Gdp per capita") + 
  theme(axis.title = element_text(size=8), plot.title = element_text(hjust = 0.5))

But now the Population legend has changed. How can I display the correct Population legend and remove the Country legend?

(Image: bubble plot)

Upvotes: 1

Views: 616

Answers (3)

stefan

Reputation: 123893

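First, converting everything to numeric isn't a good idea if your dataset contains string columns: as.numeric() turns values such as the country names into NA. Second, you need pivot_wider instead of pivot_longer, so that each series becomes its own column. Third, use backticks ` instead of quotes around non-standard column names; a quoted string inside aes() is treated as a constant, not as a column: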

library(ggplot2)
library(dplyr)
library(tidyr)
library(viridis)

bubble2 <- bubble %>%
  pivot_wider(names_from = "Series", values_from = `2019`)

ggplot(bubble2, aes(x = `GDP per capita`, y = `Life Expectancy`, size = `Population (M)`, fill = Country)) +
  geom_point(alpha = 0.5, shape = 21, color = "black") +
  # scale_size(range = c(10, 1400), name = "Population (M)") +
  scale_fill_viridis(discrete = TRUE, guide = "none", option = "A") +
  ylab("Life Expectancy") +
  xlab("Gdp per capita")
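To see why pivot_wider is the right reshape here, it can help to inspect the result before plotting. The following is a minimal sketch that rebuilds the six rows from the DATA section as a plain data frame (an assumption made only to keep the example self-contained) and checks the shape of the wide table:

```r
library(tidyr)

# Same six rows as in the DATA section, typed out as a plain data.frame
bubble <- data.frame(
  Country = rep(c("United Kingdom", "United States"), each = 3),
  Series  = rep(c("GDP per capita", "Life Expectancy", "Population (M)"), times = 2),
  `2019`  = c(42354, 81, 67, 65298, 78.8, 328),
  check.names = FALSE  # keep the non-standard column name `2019`
)

# Spread the Series values into one column per series
bubble2 <- pivot_wider(bubble, names_from = "Series", values_from = `2019`)

# One row per country; Country plus one column per series
names(bubble2)
nrow(bubble2)
```

Each aesthetic in the plot (x, y, size) then maps to its own column, which is what ggplot2 expects.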

EDIT: Adding labels via geom_text:

ggplot(bubble2, aes(x = `GDP per capita`, y = `Life Expectancy`, size = `Population (M)`, fill = Country)) +
  geom_point(alpha = 0.5, shape = 21, color = "black") +
  geom_text(aes(label = Country), size = 8 / .pt) +
  #ggrepel::geom_text_repel(aes(label = Country), size = 8 / .pt) +
  scale_fill_viridis(discrete = TRUE, guide = "none", option = "A") +
  ylab("Life Expectancy") +
  xlab("Gdp per capita")
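The follow-up in the question asks how to keep the Population legend while dropping the Country legend. One possible approach, sketched here rather than taken from the original answer, is to suppress only the fill guide with guides(fill = "none") and to set explicit breaks on the size scale. The break values 100, 200 and 300 and the two-row data frame are illustrative assumptions:

```r
library(ggplot2)

# Wide data as produced by pivot_wider(), typed out so the sketch is self-contained
bubble2 <- data.frame(
  Country = c("United Kingdom", "United States"),
  `GDP per capita` = c(42354, 65298),
  `Life Expectancy` = c(81, 78.8),
  `Population (M)` = c(67, 328),
  check.names = FALSE
)

p <- ggplot(bubble2, aes(x = `GDP per capita`, y = `Life Expectancy`,
                         size = `Population (M)`, fill = Country)) +
  geom_point(alpha = 0.5, shape = 21, color = "black") +
  geom_text(aes(label = Country), size = 8 / .pt) +
  # Explicit breaks control which population values appear in the legend
  # (illustrative values, adjust to the range of your data)
  scale_size(range = c(0.1, 24), breaks = c(100, 200, 300),
             name = "Population (M)") +
  guides(fill = "none") +  # remove only the Country legend
  ylab("Life Expectancy") +
  xlab("GDP per capita")

p
```

Because the points are already labelled with the country names, dropping the fill legend loses no information.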

DATA

bubble <- structure(list(Country = c(
  "United Kingdom", "United Kingdom",
  "United Kingdom", "United States", "United States", "United States"
), Series = c(
  "GDP per capita", "Life Expectancy", "Population (M)",
  "GDP per capita", "Life Expectancy", "Population (M)"
), `2019` = c(
  42354,
  81, 67, 65298, 78.8, 328
)), row.names = c(NA, -6L), class = c(
  "tbl_df",
  "tbl", "data.frame"
))

Upvotes: 3

jay.sf

Reputation: 72623

You could do this with less effort by using reshape2::dcast to reshape your data into wide format and then plotting with base graphics.

library(reshape2)
plot(`Life Expectancy` ~ `GDP per capita`, dcast(bubble, Country ~ Series),
     cex=`Population (M)`/120, ylim=c(78.5, 81.5), main='Here might be your title')
pseq <- seq(100, 300, 50)
legend('topright', legend=pseq, pch=1, pt.cex=pseq/120, title='Population (M)')


The colors are actually not necessary, are they?


Data:

bubble <- structure(list(Country = c("United Kingdom", "United Kingdom", 
"United Kingdom", "United States", "United States", "United States"
), Series = c("GDP per capita", "Life Expectancy", "Population (M)", 
"GDP per capita", "Life Expectancy", "Population (M)"), `2019` = c(42354, 
81, 67, 65298, 78.8, 328)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 1

Mohanasundaram

Reputation: 2949

In addition to the answer by @stefan: the size range is too large. Setting axis limits might also help.

bubble %>%
  pivot_wider(names_from = "Series", values_from = `2019`) %>%
  ggplot(aes(x = `GDP per capita`, y = `Life Expectancy`, size = `Population (M)`, fill = Country)) +
  geom_point(alpha = 0.5, shape = 21, color = "black") +
  # a smaller size range keeps the bubbles inside the plot area
  scale_size(range = c(10, 40), name = "Population (M)") +
  xlim(c(30000, 80000)) + ylim(c(77, 82)) +
  # guide = "none" replaces the deprecated guide = FALSE
  scale_fill_viridis(discrete = TRUE, guide = "none", option = "A") +
  ylab("Life Expectancy") +
  xlab("Gdp per capita") +
  guides(size = "none")


Upvotes: 1
