user3115933

Reputation: 4443

Can this chart be created in R using ggplot2?

Assuming I have the following dataframe in R:

df1 <- read.csv("jan.csv", stringsAsFactors = FALSE, header = TRUE)
str(df1)

'data.frame':   4 obs. of  5 variables:
 $ JANUARY: chr  "D-150" "D-90" "D-60" "D-30"
 $ X2016  : num   0.24    0.5    0.63   0.76
 $ X2017  : num   0.32    0.45   0.6    0.79
 $ X2018  : num   0.2     0.4    0.61   0.82
 $ X2019  : num   0.21    0.35   0.63   0.85

How can I use ggplot2 to output a graph like the one below (made in Excel):

[Image: Excel chart titled "JANUARY"]

I am comfortable with producing a simple column chart in ggplot2 but I am struggling to group the bars as shown above and place the relevant labels. Also, do I need to reshape the data to achieve this?

Upvotes: 3

Views: 320

Answers (5)

kath

Reputation: 7724

First I transform the data from wide to long format with gather and then turn the original column names (X2016, X2017, ...) into a numeric variable with parse_number. I use fct_inorder to order the levels of JANUARY in the order they appear.

library(tidyverse)

df1_long <- df1 %>% 
  gather(year, percentage, -JANUARY) %>% 
  mutate(year = parse_number(year), 
         JANUARY = fct_inorder(JANUARY)) 

df1_long

#    JANUARY year percentage
# 1    D-150 2016       0.24
# 2     D-90 2016       0.50
# 3     D-60 2016       0.63
# 4     D-30 2016       0.76
# 5    D-150 2017       0.32
# 6     D-90 2017       0.45
# 7     D-60 2017       0.60
# 8     D-30 2017       0.79
# 9    D-150 2018       0.20
# 10    D-90 2018       0.40
# 11    D-60 2018       0.61
# 12    D-30 2018       0.82
# 13   D-150 2019       0.21
# 14    D-90 2019       0.35
# 15    D-60 2019       0.63
# 16    D-30 2019       0.85
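
As a side note, tidyr now documents gather() as superseded by pivot_longer(). A minimal sketch of the same wide-to-long reshape with pivot_longer() could look like the following. It assumes tidyr 1.0 or later, rebuilds df1 inline so that it runs on its own, and returns the rows in a different order than gather() (grouped by JANUARY rather than by year), which does not matter for plotting.

```r
library(tidyr)

# Same data as in the question, rebuilt here so the sketch is self-contained
df1 <- data.frame(JANUARY = c("D-150", "D-90", "D-60", "D-30"),
                  X2016   = c(0.24, 0.5, 0.63, 0.76),
                  X2017   = c(0.32, 0.45, 0.6, 0.79),
                  X2018   = c(0.2, 0.4, 0.61, 0.82),
                  X2019   = c(0.21, 0.35, 0.63, 0.85),
                  stringsAsFactors = FALSE)

# Wide to long: one row per (JANUARY, year) pair; names_prefix strips the "X"
df1_long <- pivot_longer(df1,
                         cols         = -JANUARY,
                         names_to     = "year",
                         names_prefix = "X",
                         values_to    = "percentage")

# Turn the year into a number and keep JANUARY in order of appearance
df1_long$year    <- as.integer(df1_long$year)
df1_long$JANUARY <- factor(df1_long$JANUARY, levels = unique(df1_long$JANUARY))
```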

This data can then be used for plotting.

ggplot(df1_long, aes(year, percentage, fill = JANUARY)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, 1)) +
  facet_wrap(~ JANUARY, nrow = 1, strip.position = "bottom") +
  geom_text(aes(label = year), y = 0.1, angle = 90, color = "white")  +
  geom_text(aes(label = str_c(percentage*100, "%")), vjust = -0.5) +
  ggtitle("Month of JANUARY") +
  scale_fill_manual(values = c("darkblue", "darkgreen", "burlywood2", "darkorchid4")) +
  theme_minimal() +
  theme(axis.text.x = element_blank(), 
        axis.ticks.x = element_blank(), 
        axis.title = element_blank(),
        panel.spacing = unit(0, "cm"),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.position = "none")


Data

df1 <- data.frame(JANUARY = c("D-150", "D-90", "D-60", "D-30"),
                  X2016   = c(0.24, 0.5, 0.63, 0.76),
                  X2017   = c(0.32, 0.45, 0.6, 0.79),
                  X2018   = c(0.2, 0.4, 0.61, 0.82),
                  X2019   = c(0.21, 0.35, 0.63, 0.85))

Upvotes: 1

Jorge Lopez

Reputation: 467

Yes, it is doable. But first we need your data in a truly tabular (long) format, as if you were exporting it to SQL.

So, this is (a subset of) your data:

January = c("D-150","D-90","D-60")
x2016 = c(0.24 ,   0.5,    0.63)
x2017 = c(0.32  ,  0.45,   0.6)
x2018 = c(0.2   ,  0.4  ,  0.61)
df1 <- data.frame(January,x2016,x2017,x2018)

To get it into a shape that can be plotted, we have to collapse your year columns into two columns (one holding the year, one holding the value), like this:

library(tidyr)
nuevoDf1<-gather(data = df1, losAnhos,valores,-January)

The result will look like this (first rows shown):

  January losAnhos valores 
1   D-150    x2016    0.24 
2    D-90    x2016    0.50 
3    D-60    x2016    0.63 
4   D-150    x2017    0.32 
5    D-90    x2017    0.45

Finally, using ggplot2, you can start off your graph with:

# geom_col() draws each bar at the height of its value
ggplot(nuevoDf1, aes(losAnhos, valores)) +
  facet_wrap(~January) +
  geom_col(na.rm = TRUE)

The result will be something like the one in the picture. I'm not a big fan of colors, but ggplot2 lets you keep customizing the plot after it has been constructed. Hope that sets you on the right path to work out the ephemeral, momentary beauty of the graph yourself.


Upvotes: 2

Paweł Chabros

Reputation: 2399

Yes, you can, although I think your year labels aren't correct. Here's the code that generates my version of the plot:

library(tidyverse)

df1 %>%
  gather(year, value, X2016:X2019) %>%
  mutate(JANUARY = JANUARY %>% fct_rev() %>% fct_relevel('D-150')) %>%
  group_by(JANUARY) %>%
  mutate(y_pos = min(value) / 2) %>%
  ggplot(aes(
    x = JANUARY,
    y = value,
    fill = JANUARY,
    group = year
  )) +
  geom_col(
    position = position_dodge(.65),
    width = .5
  ) +
  geom_text(aes(
      y = value + max(value) * .03,
      label = round(value * 100) %>% str_c('%')
    ),
    position = position_dodge(.65)
  ) +
  geom_text(aes(
      y = y_pos,
      label = str_remove(year, 'X')
    ),
    color = 'white',
    angle = 90,
    fontface = 'bold',
    position = position_dodge(.65)
  ) +
  scale_y_continuous(
    breaks = seq(0, .9, .1),
    labels = function(x) round(x * 100) %>% str_c('%')
  ) +
  scale_fill_manual(values = c(
    rgb(47, 85, 151, maxColorValue = 255),
    rgb(84, 130, 53, maxColorValue = 255),
    rgb(244, 177, 131, maxColorValue = 255),
    rgb(112, 48, 160, maxColorValue = 255)
  )) +
  theme(
    plot.title = element_text(hjust = .5),
    panel.background = element_blank(),
    panel.grid.major.y = element_line(color = rgb(.9, .9, .9)),
    axis.ticks = element_blank(),
    legend.position = 'none'
  ) +
  xlab('') +
  ylab('') +
  ggtitle('Month of JANUARY')
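
The fct_rev() %>% fct_relevel('D-150') step in the code above is easy to miss, so here is a small sketch of what it does to the level order. It assumes JANUARY starts out as a character column, so its default factor levels are alphabetical. The base-R alternative at the end is not part of the answer; it is shown only as another way to keep the order of appearance.

```r
library(forcats)

x <- c("D-150", "D-90", "D-60", "D-30")

# Default factor levels are sorted alphabetically: D-150, D-30, D-60, D-90
f_default <- factor(x)

# Reverse the levels, then move "D-150" back to the front
f_ordered <- fct_relevel(fct_rev(f_default), "D-150")
levels(f_ordered)
# "D-150" "D-90" "D-60" "D-30"

# Base-R alternative (an assumption, not from the answer): order of appearance
f_base <- factor(x, levels = unique(x))
```

The level order is what determines the left-to-right order of the groups on the x axis and the order in which scale_fill_manual() assigns its colours.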

Upvotes: 5

Mike H.

Reputation: 14360

With a little more data processing, I think you can achieve what you want. We start by melting the data into long format, which is what ggplot2 expects for this type of plot. Then we create a separate labels dataset that holds the y-position of the year labels (in the original chart this appears to be half of the minimum value within each "D" group):

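library(reshape2)  # melt() is assumed to come from reshape2; the original did not load a package
library(ggplot2)

df_m <- melt(df, id.vars = "JANUARY")
df_m$above_text <- scales::percent(df_m$value)  # percentage labels shown above the bars
labels <- df_m
# y-position of the year labels: half of the smallest value in each "D" group
labels$value <- ave(labels$value, labels$JANUARY, FUN = function(x) min(x/2))
labels$variable <- sub("X", "", labels$variable)  # "X2016" -> "2016"
pos_d <- position_dodge(width = 0.7)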

ggplot(df_m, aes(x = JANUARY, y = value, group = variable, fill = JANUARY)) + 
  geom_col(width = 0.6, position = pos_d) +
  geom_text(aes(label = above_text), position = pos_d, size = 2, hjust = 0.5, vjust = -1) + 
  geom_text(data = labels, aes(x = JANUARY, y = value, group = variable, label = variable), angle = 90, position = pos_d, hjust = 0.5)
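
To make the ave() call above more concrete, here is a small base-R sketch on a subset of the question's numbers (two of the four groups). The variable names value and group are only for illustration.

```r
# Values for two groups, four years each (taken from the question's data)
value <- c(0.24, 0.32, 0.20, 0.21,   # D-150
           0.50, 0.45, 0.40, 0.35)   # D-90
group <- rep(c("D-150", "D-90"), each = 4)

# ave() applies FUN within each group and returns a vector of the same
# length, so every row gets half of its group's minimum value
label_y <- ave(value, group, FUN = function(x) min(x / 2))
label_y
# 0.100 0.100 0.100 0.100 0.175 0.175 0.175 0.175
```

Because every bar in a group shares the same y-position, the rotated year labels line up at one height inside the bars of that group.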


Note that you can play around with the size of the % labels. What looks good depends on the actual dimensions of your image file. Around 2.75 looked OK for me, but that looked crowded when copied as an image here.
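
One way to judge the label size is to save the plot at the dimensions you actually intend to use and look at the file. The following is only a sketch: the two-row data frame, the file name, and the 6 x 4 inch size are placeholders, not values from the answer.

```r
library(ggplot2)

# Tiny stand-in dataset, just enough to draw two labelled bars
df_demo <- data.frame(group = c("D-150", "D-90"), value = c(0.24, 0.5))

p <- ggplot(df_demo, aes(group, value)) +
  geom_col() +
  # size is in mm, so the labels keep their physical size when width/height change
  geom_text(aes(label = scales::percent(value)), size = 2.75, vjust = -1)

# Write the plot at a fixed physical size, then inspect the saved file
out_file <- file.path(tempdir(), "january_demo.png")
ggsave(out_file, plot = p, width = 6, height = 4, dpi = 150)
```

Changing width and height while keeping size fixed shows how the same labels can look roomy in one export and crowded in another.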

Data:

df <- data.frame(JANUARY = c("D-150", "D-90", "D-60", "D-30"),
                 X2016   = c(0.24, 0.5, 0.63, 0.76),
                 X2017   = c(0.32, 0.45, 0.6, 0.79),
                 X2018   = c(0.2, 0.4, 0.61, 0.82),
                 X2019   = c(0.21, 0.35, 0.63, 0.85), stringsAsFactors = FALSE)

Upvotes: 3

Wimpel

Reputation: 27732

my approach

sample data

library( data.table )

dt <- fread('year  "D-150" "D-90" "D-60" "D-30"
2016   0.24    0.5    0.63   0.76
2017   0.32    0.45   0.6    0.79
2018   0.2     0.4    0.61   0.82
2019   0.21    0.35   0.63   0.85', header = TRUE)

code

#first, melt
dt.melt <- melt( dt, id.vars = "year", variable.name = "Dvalue", value.name = "value" )
#create values (=positions in the chart) for the year-text within the bars.
dt.melt[, yearTextPos := min( value / 2 ), by = "Dvalue"]

#then build chart
library( ggplot2 )
library( scales)
ggplot( dt.melt, aes( x = Dvalue, y = value, group = year, fill = Dvalue ) ) + 
  #build the bars, dodged position
  geom_col( width = 0.6, position = position_dodge(width = 0.75) ) +
  #set up the y-scale
  scale_y_continuous( limits = c(0,1), breaks = seq(0,1,0.1), 
                      labels = scales::percent, expand = c(0,0) ) +
  #insert year-text in bars, at the previously calculated positions
  geom_text( aes( x = Dvalue, y = yearTextPos, group = year, label = year ), 
             color = "white", position = position_dodge( width = 0.75  ), 
             hjust = 0.5, angle = 90, size = 5 ) +
  #write value on top as percentage
  geom_text( aes( x = Dvalue, y = value + 0.01, group = year, 
                  label = paste0( round( value * 100), "%" ) ), 
             color = "black", position = position_dodge( width = 0.75  ), 
             hjust = 0.5, angle = 0, size = 3 )


Upvotes: 2
