gofraidh

Reputation: 683

Using ggplot geom_line to visualize a difference between two years

I am trying to visualize a data set with a number of countries, each with a value of the World Press Freedom Index for two years (2013 and 2014). I have already searched through answers on Stack Overflow and other sites, but I could not find anything that would help me with this. This is the data set after using reshape2's melt on it:

pfindex2narrow = reshape2::melt(pfindex2, id.vars = 'Origin')



 pfindex2narrow
             Origin variable value
1           Eritrea     2014 84.86
2        NorthKorea     2014 83.25
3      Turkmenistan     2014 80.83
4             Syria     2014 77.29
5             China     2014 73.55
6           Vietnam     2014 72.63
7             Sudan     2014 72.34
8              Iran     2014 72.32
9           Somalia     2014 72.31
10             Laos     2014 71.25
11         Djibouti     2014 71.04
12             Cuba     2014 70.21
13            Yemen     2014 66.36
14 EquatorialGuinea     2014 66.23
15       Uzbekistan     2014 61.14
16      SaudiArabia     2014 59.41
17          Bahrain     2014 58.69
18       Azerbaijan     2014 58.41
19           Rwanda     2014 56.57
20            Libya     2014 45.99
21          Eritrea     2013 84.83
22       NorthKorea     2013 81.96
23     Turkmenistan     2013 80.81
24            Syria     2013 77.04
25            China     2013 72.91
26          Vietnam     2013 72.36
27            Sudan     2013 71.88
28             Iran     2013 72.29
29          Somalia     2013 73.19
30             Laos     2013 71.22
31         Djibouti     2013 70.34
32             Cuba     2013 70.92
33            Yemen     2013 67.26
34 EquatorialGuinea     2013 67.95
35       Uzbekistan     2013 61.01
36      SaudiArabia     2013 58.30
37          Bahrain     2013 58.26
38       Azerbaijan     2013 52.87
39           Rwanda     2013 56.57
40            Libya     2013 39.84

The goal is to visualize the difference between the two years' index values for each country and to show whether the index is decreasing or increasing. My own attempt is below. I am trying to use ggplot2 for this; however, as you can see, there are a few issues (e.g. the lines seem arbitrary and do not relate to the real index values).

b = ggplot(pfindex2narrow, aes(x = variable, y = value, group = Origin)) + 
  geom_line() + 
  geom_text(aes(label = value, hjust = 0.5), size = 4)
b + facet_wrap(~ Origin, ncol = 2) + 
  theme(
    axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 0.5),
    axis.text.y = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.ticks = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank()
  )

And this is the output:

[Image: output of the code above]

Sadly, I have run out of ideas for how to fix this and I am kind of stuck. Do you have any ideas on how to approach this?

Thank you!

Upvotes: 3

Views: 3456

Answers (1)

Axeman

Reputation: 35377

Here are a few options:

First, let's shorten the data name a bit for less typing and easier reading. I also reordered the data, so it's easier to get a quick grasp of what's going on.

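d <- pfindex2narrow
d$variable <- factor(d$variable)
# Order the countries by their 2014 value, highest first.
# This indexing assumes the 2014 rows come first in the data, as they do above.
d$Origin <- factor(
  d$Origin, 
  levels = (d$Origin)[rev(order(d$value[d$variable == '2014']))]
)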
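The reordering step above can be hard to read, so here is a minimal sketch of the same idea on a toy data frame. The names `toy`, `is_2014` and `countries_2014` are made up for illustration, and this variant subsets `Origin` with the same condition as `value`, so it does not depend on the 2014 rows coming first.

```r
# Toy data in the same long format as pfindex2narrow (values are invented)
toy <- data.frame(
  Origin   = c("A", "B", "C", "A", "B", "C"),
  variable = c("2014", "2014", "2014", "2013", "2013", "2013"),
  value    = c(50, 80, 65, 48, 79, 70)
)

# Pick the 2014 rows once and use them for both the names and the values
is_2014 <- toy$variable == "2014"
countries_2014 <- toy$Origin[is_2014]
values_2014 <- toy$value[is_2014]

# Factor levels sorted by the 2014 value, highest first
toy$Origin <- factor(
  toy$Origin,
  levels = countries_2014[order(values_2014, decreasing = TRUE)]
)

levels(toy$Origin)
```

ggplot2 draws discrete axes and legends in factor-level order, which is why setting the levels changes the order of the countries in the plots.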

We can make a line plot, but it is quite confusing to look at:

ggplot(d, aes(x = variable, y = value, col = Origin, group = Origin)) + 
  geom_line(size = 1) + 
  scale_x_discrete(expand = c(0.1, 0), limits = c('2013', '2014', 'country')) +
  theme_bw()


I'm not really a fan. (It could look much better if we plotted the ranks of the countries instead.)
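A rough sketch of what a rank plot could look like, assuming ggplot2 is installed. It uses only four countries copied from the question's data, so the ranks are ranks within this subset and not within the full table. The names `d_rank` and `p_rank` are made up for this example.

```r
library(ggplot2)

# Four countries taken from the question's data, for illustration only
d_rank <- data.frame(
  Origin   = rep(c("Eritrea", "Somalia", "Sudan", "Iran"), times = 2),
  variable = factor(rep(c("2013", "2014"), each = 4)),
  value    = c(84.83, 73.19, 71.88, 72.29,   # 2013
               84.86, 72.31, 72.34, 72.32)   # 2014
)

# Rank within each year; rank 1 is the highest index value
d_rank$rank <- ave(-d_rank$value, d_rank$variable, FUN = rank)

# One line per country; the y axis is reversed so rank 1 is at the top
p_rank <- ggplot(d_rank, aes(x = variable, y = rank, group = Origin, col = Origin)) +
  geom_line() +
  geom_point() +
  scale_y_reverse(breaks = 1:4) +
  theme_bw()

p_rank
```

With ranks, lines only cross when two countries actually swap places, which tends to be easier to read than many nearly parallel lines of raw values.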

Perhaps we can use bars instead:

ggplot(d, aes(x = Origin, y = value, fill = variable)) + 
  geom_bar(stat = 'identity', position = 'dodge') + 
  theme_bw() + 
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))


Already better! But bars that start at 0 perhaps do not use the space very well. Another option would be to use points.

d2 <- d
# Shift the line endpoints sideways by the same amount the points are dodged
d2$x <- as.numeric(d2$Origin) + ifelse(d2$variable == '2013', -0.25, 0.25)
ggplot(d2, aes(x = Origin, y = value, col = variable)) + 
  geom_point(position = position_dodge(width = 1)) +
  geom_line(aes(x = x, group = Origin), col = 1) +
  theme_bw() + 
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))


And finally, if only the change is important, and not so much the absolute value, we can express that instead:

library(dplyr)
d3 <- d %>% 
  group_by(Origin) %>% 
  arrange(variable) %>% 
  summarize(dif = diff(value)) %>% 
  arrange(dif)
d3$Origin <- factor(d3$Origin, levels = unique(d3$Origin))

ggplot(d3, aes(Origin, dif, fill = Origin)) + 
  geom_bar(stat = 'identity', position = 'identity') +
  coord_flip() +
  theme_minimal() +
  guides(fill = 'none') +
  xlab('') + ylab('change from 2013 to 2014')
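The pipeline above relies on `arrange(variable)` putting 2013 before 2014, which depends on the level order of `variable`. As a hedged alternative, here is a base R sketch that makes the subtraction explicit by joining the two years on the country name. It uses three rows copied from the question's data; `long`, `wide` and `direction` are names invented for this example.

```r
# Three countries copied from the question's data, in long format
long <- data.frame(
  Origin   = c("Libya", "Somalia", "Rwanda", "Libya", "Somalia", "Rwanda"),
  variable = c("2014", "2014", "2014", "2013", "2013", "2013"),
  value    = c(45.99, 72.31, 56.57, 39.84, 73.19, 56.57)
)

# Split by year and join on country, so the row order does not matter
y13 <- long[long$variable == "2013", c("Origin", "value")]
y14 <- long[long$variable == "2014", c("Origin", "value")]
wide <- merge(y13, y14, by = "Origin", suffixes = c("_2013", "_2014"))

# Explicit change from 2013 to 2014, plus a label for its direction
wide$dif <- wide$value_2014 - wide$value_2013
wide$direction <- ifelse(wide$dif > 0, "increase",
                  ifelse(wide$dif < 0, "decrease", "no change"))

wide[order(wide$dif), ]
```

The `direction` column could then be mapped to `fill` in the bar chart, so that increases and decreases get different colours.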


Upvotes: 8
