Chris

Reputation: 87

How to plot this picture using ggplot2?

enter image description here

Above is my dataset, a simple one. It shows the GDP per capita of the richest and the poorest regions in nine countries in 2000 and 2015, as well as the gap in GDP per capita between the poorest and richest regions. Below is a reproducible example of this dataset:

structure(list(Country = c("Britain", "Germany", "United States", 
"France", "South Korea", "Italy", "Japan", "Spain", "Sweden"), 
    Poor2000 = c(69, 50, 74, 52, 79, 50, 80, 80, 90), Poor2015 = c(61, 
    48, 73, 50, 73, 52, 78, 84, 82), Rich2000 = c(848, 311, 290, 
    270, 212, 180, 294, 143, 148), Rich2015 = c(1150, 391, 310, 
    299, 200, 198, 290, 151, 149)), row.names = c(NA, -9L), class = c("tbl_df", 
"tbl", "data.frame"))

I want to make a plot like this:

enter image description here

In this plot I just want to show the GDP per capita of the poorest regions in the nine countries in 2000 and 2015 (the draft picture has only three countries for the sake of convenience). But I don't know how to do it using ggplot, because it seems I need to set the x-axis to "Country" and the y-axis to two variables, "Poor2000" and "Poor2015". I don't know how to do that. Many thanks in advance.

Upvotes: 1

Views: 69

Answers (1)

dc37

Reputation: 16178

Here is a possible solution. Starting from your dataframe, you can first create a new dataframe that reshapes it into a longer format. To do that, I used the pivot_longer function from the tidyr package:

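library(tidyr)
library(dplyr)
# df is the data frame defined under "Reproducible example" at the end of this answer
DF <- df %>% select(Country, Poor2000, Poor2015) %>%
  mutate(Diff = Poor2015 - Poor2000) %>%  # change from 2000 to 2015
  pivot_longer(-Country, names_to = "Poor", values_to = "value")  # one row per country and variable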

# A tibble: 27 x 3
   Country       Poor     value
   <fct>         <chr>    <dbl>
 1 Britain       Poor2000    69
 2 Britain       Poor2015    61
 3 Britain       Diff        -8
 4 Germany       Poor2000    50
 5 Germany       Poor2015    48
 6 Germany       Diff        -2
 7 United States Poor2000    74
 8 United States Poor2015    73
 9 United States Diff        -1
10 France        Poor2000    52
# … with 17 more rows

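We will also create a second dataframe that contains the difference between Poor2015 and Poor2000 (Diff), plus the larger of the two values for each country (ypos), which is used to place the difference label above the taller bar: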

DF_second_label <-  df %>% select(Country, Poor2000, Poor2015) %>%
  group_by(Country) %>% 
  mutate(Diff = Poor2015 - Poor2000, ypos = max(Poor2000,Poor2015))

# A tibble: 9 x 5
# Groups:   Country [9]
  Country       Poor2000 Poor2015  Diff  ypos
  <fct>            <dbl>    <dbl> <dbl> <dbl>
1 Britain             69       61    -8    69
2 Germany             50       48    -2    50
3 United States       74       73    -1    74
4 France              52       50    -2    52
5 South Korea         79       73    -6    79
6 Italy               50       52     2    52
7 Japan               80       78    -2    80
8 Spain               80       84     4    84
9 Sweden              90       82    -8    90
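
As a side note, because each country occupies exactly one row, the group_by(Country) step is only there so that max() is evaluated per country. A possible alternative, shown here as a minimal sketch on three of the countries, is the vectorized pmax() function from base R, which takes the row-wise maximum without grouping. The name labels_alt is just an illustrative choice, not something from the code above.

```r
library(dplyr)

# Small stand-in for df (three of the nine countries, values from the question)
df <- data.frame(
  Country  = c("Britain", "South Korea", "Sweden"),
  Poor2000 = c(69, 79, 90),
  Poor2015 = c(61, 73, 82)
)

# pmax() compares the two columns element by element,
# so no group_by() is needed to get one maximum per row
labels_alt <- df %>%
  mutate(Diff = Poor2015 - Poor2000,
         ypos = pmax(Poor2000, Poor2015))

labels_alt
```

The result is an ungrouped data frame with the same Diff and ypos columns, so it should work as a drop-in replacement for the label data in the plot.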

Then, we can plot both new dataframes with ggplot2 and select only the countries of interest using the subset function:

library(ggplot2)
ggplot(subset(DF, Poor != "Diff" & Country %in% c("Britain","South Korea","Sweden")), 
       aes(x = Country, y = value, fill = Poor))+
  geom_col(position = position_dodge())+
  geom_text(aes(label = value), position = position_dodge(0.9), vjust = -0.5, show.legend = FALSE)+
  geom_text(inherit.aes = FALSE, 
            data = subset(DF_second_label, Country %in% c("Britain","South Korea","Sweden")),
            aes(x = Country, 
                y = ypos+10,
                label = Diff), color = "darkgreen", size = 6, show.legend = FALSE)+
  labs(x = "", y = "GDP per Person", title = "Poor in 2000 & 2015")+
  theme(plot.title = element_text(hjust = 0.5))

And you get:

enter image description here
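
Since the question asks for all nine countries, here is a rough sketch of the same idea without the subset() calls. It is self-contained and only uses the Poor2000 and Poor2015 columns. The object names (long_all, labels_all, p_all), the use of pmax() for the label height, the smaller label size and the rotated x-axis labels are my own assumptions to keep nine groups readable, so adjust them to taste.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Poor-region values for all nine countries, taken from the question
df <- data.frame(
  Country  = c("Britain", "Germany", "United States", "France", "South Korea",
               "Italy", "Japan", "Spain", "Sweden"),
  Poor2000 = c(69, 50, 74, 52, 79, 50, 80, 80, 90),
  Poor2015 = c(61, 48, 73, 50, 73, 52, 78, 84, 82)
)

# Long format: one row per country and year variable (no Diff column here,
# so nothing has to be filtered out before plotting)
long_all <- df %>%
  pivot_longer(-Country, names_to = "Poor", values_to = "value")

# One row per country: the change and the height of the taller bar
labels_all <- df %>%
  mutate(Diff = Poor2015 - Poor2000,
         ypos = pmax(Poor2000, Poor2015))

p_all <- ggplot(long_all, aes(x = Country, y = value, fill = Poor)) +
  geom_col(position = position_dodge()) +
  geom_text(aes(label = value), position = position_dodge(0.9),
            vjust = -0.5, show.legend = FALSE) +
  geom_text(data = labels_all, inherit.aes = FALSE,
            aes(x = Country, y = ypos + 10, label = Diff),
            color = "darkgreen", size = 4) +
  labs(x = "", y = "GDP per Person", title = "Poor in 2000 & 2015") +
  theme(plot.title = element_text(hjust = 0.5),
        axis.text.x = element_text(angle = 45, hjust = 1))

p_all
```

With a character Country column the bars are ordered alphabetically; converting Country to a factor with the desired level order would change that.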


Reproducible example

df <- data.frame(Country = c("Britain","Germany", "United States", "France", "South Korea", "Italy","Japan","Spain","Sweden"),
                 Poor2000 = c(69,50,74,52,79,50,80,80,90),
                 Poor2015 = c(61,48,73,50,73,52,78,84,82),
                 Rich2000 = c(848,311,290,270,212,180,294,143,148))

Upvotes: 3
