Reputation: 41
I have imported the data below from Excel and want to create a bubble plot in R:
# A tibble: 6 x 3
Country Series `2019`
<chr> <chr> <dbl>
1 United Kingdom GDP per capita 42354.
2 United Kingdom Life Expectancy 81
3 United Kingdom Population (M) 67
4 United States GDP per capita 65298.
5 United States Life Expectancy 78.8
6 United States Population (M) 328
I wrote the code below, but it does not plot anything. What can I change to produce a bubble plot?
bubble2 <- mutate_all(bubble, function(x) as.numeric(as.character(x))) %>%
pivot_longer(cols=-c("Country","Series"),names_to="year") %>%
mutate(Year=as.numeric(year)) %>%
select(-year) %>%
ggplot(aes(x="GDP per capita", y="Life Expectancy", size="Population (M)", fill=Country)) +
geom_point(alpha=0.5, shape=21, color="black") +
scale_size(range = c(10, 1400), name="Population (M)") +
scale_fill_viridis(discrete=TRUE, guide=FALSE, option="A") +
ylab("Life Expectancy") +
xlab("Gdp per capita")
EDIT: I added 10 more countries and adjusted the code:
bubble2 <- bubble %>%
pivot_wider(names_from = "Series", values_from = `2019`)
ggplot(bubble2, aes(x = `GDP per capita`, y = `Life Expectancy`, size = `Population (M)`, fill = Country)) +
geom_point(alpha = 0.5, shape = 21, color = "black") +
geom_text(aes(label = Country), size = 8 / .pt) +
#ggrepel::geom_text_repel(aes(label = Country), size = 8 / .pt) +
scale_size(range = c(.1, 24), name="Population (M)") +
ylab("Life Expectancy") +
xlab("Gdp per capita") +
theme(axis.title = element_text(size=8), plot.title = element_text(hjust = 0.5))
But the Population legend changed. How can I display the correct Population legend and remove the Country legend?
Upvotes: 1
Views: 616
Reputation: 123893
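First, converting everything to numeric isn't a good idea if your dataset contains string columns: as.numeric() turns values such as country names into NA. Second, you need pivot_wider instead of pivot_longer, so that each series becomes its own column. Third, use backticks ` instead of quotes around non-standard column names; a quoted name inside aes() is treated as a constant string, not as a column: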
library(ggplot2)
library(dplyr)
library(tidyr)
library(viridis)
bubble2 <- bubble %>%
pivot_wider(names_from = "Series", values_from = `2019`)
ggplot(bubble2, aes(x = `GDP per capita`, y = `Life Expectancy`, size = `Population (M)`, fill = Country)) +
geom_point(alpha = 0.5, shape = 21, color = "black") +
# scale_size(range = c(10, 1400), name = "Population (M)") +
scale_fill_viridis(discrete = TRUE, guide = "none", option = "A") +
ylab("Life Expectancy") +
xlab("Gdp per capita")
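To illustrate the first point, here is a minimal sketch in base R of what the blanket numeric conversion does to the text columns. The data frame is rebuilt inline from the six rows shown in the question, and the names country_num and series_num are only illustrative.

```r
# Rebuild the six example rows (check.names = FALSE keeps the column name `2019`)
bubble <- data.frame(
  Country = rep(c("United Kingdom", "United States"), each = 3),
  Series  = rep(c("GDP per capita", "Life Expectancy", "Population (M)"), times = 2),
  `2019`  = c(42354, 81, 67, 65298, 78.8, 328),
  check.names = FALSE
)

# Coercing text to numeric yields NA (R also emits a coercion warning,
# suppressed here to keep the output short)
country_num <- suppressWarnings(as.numeric(as.character(bubble$Country)))
series_num  <- suppressWarnings(as.numeric(as.character(bubble$Series)))

all(is.na(country_num))  # TRUE: every country name is lost
all(is.na(series_num))   # TRUE: every series name is lost
```

Once Country and Series are all NA, there is nothing left to reshape or map to aesthetics, which is one reason the original pipeline cannot produce a usable plot.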
EDIT: Adding labels via geom_text:
ggplot(bubble2, aes(x = `GDP per capita`, y = `Life Expectancy`, size = `Population (M)`, fill = Country)) +
geom_point(alpha = 0.5, shape = 21, color = "black") +
geom_text(aes(label = Country), size = 8 / .pt) +
#ggrepel::geom_text_repel(aes(label = Country), size = 8 / .pt) +
scale_fill_viridis(discrete = TRUE, guide = "none", option = "A") +
ylab("Life Expectancy") +
xlab("Gdp per capita")
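Regarding the follow-up about the legends, one possible sketch is shown here. It assumes ggplot2 3.3.4 or later (for guides(fill = "none")), rebuilds the six example rows inline, and uses scale_size_area() with hand-picked legend breaks; the breaks and max_size are illustrative choices, not values from the question.

```r
library(ggplot2)
library(tidyr)

# Same six rows as the example data in this thread
bubble <- tibble::tibble(
  Country = rep(c("United Kingdom", "United States"), each = 3),
  Series  = rep(c("GDP per capita", "Life Expectancy", "Population (M)"), times = 2),
  `2019`  = c(42354, 81, 67, 65298, 78.8, 328)
)

bubble2 <- pivot_wider(bubble, names_from = "Series", values_from = `2019`)

p <- ggplot(bubble2, aes(x = `GDP per capita`, y = `Life Expectancy`)) +
  # size and fill are mapped on the points only
  geom_point(aes(size = `Population (M)`, fill = Country),
             alpha = 0.5, shape = 21, color = "black") +
  # show.legend = FALSE keeps the text glyph out of the legend keys
  geom_text(aes(label = Country), size = 8 / .pt, show.legend = FALSE) +
  # bubble area proportional to population; breaks are an assumed choice
  scale_size_area(max_size = 24, name = "Population (M)",
                  breaks = c(100, 200, 300)) +
  # drop the Country legend, since the points are labelled directly
  guides(fill = "none") +
  labs(x = "GDP per capita", y = "Life Expectancy")

p
```

Mapping size and fill inside geom_point() rather than in the top-level ggplot() call keeps the text layer from inheriting them, so only the points contribute to the size legend.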
DATA
bubble <- structure(list(Country = c(
"United Kingdom", "United Kingdom",
"United Kingdom", "United States", "United States", "United States"
), Series = c(
"GDP per capita", "Life Expectancy", "Population (M)",
"GDP per capita", "Life Expectancy", "Population (M)"
), `2019` = c(
42354,
81, 67, 65298, 78.8, 328
)), row.names = c(NA, -6L), class = c(
"tbl_df",
"tbl", "data.frame"
))
Upvotes: 3
Reputation: 72623
Another approach that takes less effort: use reshape2::dcast to reshape your data into wide format, then plot with base graphics.
library(reshape2)
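# value.var names the column holding the values; without it dcast guesses and prints a message
plot(`Life Expectancy` ~ `GDP per capita`, dcast(bubble, Country ~ Series, value.var='2019'),
     cex=`Population (M)`/120, ylim=c(78.5, 81.5), main='Here might be your title')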
pseq <- seq(100, 300, 50)
legend('topright', legend=pseq, pch=1, pt.cex=pseq/120, title='Population (M)')
The colors are actually not necessary, are they?
Data:
bubble <- structure(list(Country = c("United Kingdom", "United Kingdom",
"United Kingdom", "United States", "United States", "United States"
), Series = c("GDP per capita", "Life Expectancy", "Population (M)",
"GDP per capita", "Life Expectancy", "Population (M)"), `2019` = c(42354,
81, 67, 65298, 78.8, 328)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
Upvotes: 1
Reputation: 2949
In addition to the answer by @stefan: the size range is too large. Setting axis limits might also help.
library(ggplot2)
library(dplyr)
library(tidyr)
library(viridis)
bubble %>%
  pivot_wider(names_from = "Series", values_from = `2019`) %>%
  ggplot(aes(x=`GDP per capita`, y=`Life Expectancy`, size=`Population (M)`, fill=Country)) +
  geom_point(alpha=0.5, shape=21, color="black") +
  # a smaller size range keeps the bubbles inside the panel
  scale_size(range = c(10, 40), name="Population (M)") +
  xlim(c(30000, 80000)) + ylim(c(77, 82)) +
  # guide = "none" replaces the deprecated guide = FALSE
  scale_fill_viridis(discrete=TRUE, guide="none", option="A") +
  ylab("Life Expectancy") +
  xlab("Gdp per capita") +
  guides(size = "none")
Upvotes: 1