Reputation: 4443
Assuming I have the following data frame in R:
df1 <- read.csv("jan.csv", stringsAsFactors = FALSE, header = TRUE)
str(df1)
'data.frame': 4 obs. of 5 variables:
$ JANUARY: chr "D-150" "D-90" "D-60" "D-30"
$ X2016 : num 0.24 0.5 0.63 0.76
$ X2017 : num 0.32 0.45 0.6 0.79
$ X2018 : num 0.2 0.4 0.61 0.82
$ X2019 : num 0.21 0.35 0.63 0.85
How can I use ggplot2 to output a graph like the one below (made in Excel):
I am comfortable with producing a simple column chart in ggplot2, but I am struggling to group the bars as shown above and place the relevant labels. Also, do I need to reshape the data to achieve this?
Upvotes: 3
Views: 320
Reputation: 7724
First I transform the data from wide to long format with gather and then turn the original column names (X2016, X2017, ...) into a numeric variable with parse_number. I use fct_inorder to order the levels of JANUARY in the order they appear.
library(tidyverse)
df1_long <- df1 %>%
  gather(year, percentage, -JANUARY) %>%
  mutate(year = parse_number(year),
         JANUARY = fct_inorder(JANUARY))
df1_long
# JANUARY year percentage
# 1 D-150 2016 0.24
# 2 D-90 2016 0.50
# 3 D-60 2016 0.63
# 4 D-30 2016 0.76
# 5 D-150 2017 0.32
# 6 D-90 2017 0.45
# 7 D-60 2017 0.60
# 8 D-30 2017 0.79
# 9 D-150 2018 0.20
# 10 D-90 2018 0.40
# 11 D-60 2018 0.61
# 12 D-30 2018 0.82
# 13 D-150 2019 0.21
# 14 D-90 2019 0.35
# 15 D-60 2019 0.63
# 16 D-30 2019 0.85
This data can then be used for plotting.
ggplot(df1_long, aes(year, percentage, fill = JANUARY)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, 1)) +
  facet_wrap(~ JANUARY, nrow = 1, strip.position = "bottom") +
  geom_text(aes(label = year), y = 0.1, angle = 90, color = "white") +
  geom_text(aes(label = str_c(percentage*100, "%")), vjust = -0.5) +
  ggtitle("Month of JANUARY") +
  scale_fill_manual(values = c("darkblue", "darkgreen", "burlywood2", "darkorchid4")) +
  theme_minimal() +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title = element_blank(),
        panel.spacing = unit(0, "cm"),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.position = "none")
Data
df1 <- data.frame(JANUARY = c("D-150", "D-90", "D-60", "D-30"),
X2016 = c(0.24, 0.5, 0.63, 0.76),
X2017 = c(0.32, 0.45, 0.6, 0.79),
X2018 = c(0.2, 0.4, 0.61, 0.82),
X2019 = c(0.21, 0.35, 0.63, 0.85))
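As a side note, gather() has been superseded in tidyr by pivot_longer(). The wide-to-long step above could be sketched with it roughly as follows. This is only an illustration under the assumption that tidyr 1.0.0 or later is installed; the final arrange() call is my addition, because pivot_longer() returns the rows grouped by JANUARY rather than by year.

```r
library(dplyr)
library(tidyr)
library(readr)
library(forcats)

df1 <- data.frame(JANUARY = c("D-150", "D-90", "D-60", "D-30"),
                  X2016 = c(0.24, 0.5, 0.63, 0.76),
                  X2017 = c(0.32, 0.45, 0.6, 0.79),
                  X2018 = c(0.2, 0.4, 0.61, 0.82),
                  X2019 = c(0.21, 0.35, 0.63, 0.85))

df1_long <- df1 %>%
  # every column except JANUARY becomes a (year, percentage) pair
  pivot_longer(-JANUARY, names_to = "year", values_to = "percentage") %>%
  # "X2016" -> 2016; keep the D-groups in their order of appearance
  mutate(year = parse_number(year),
         JANUARY = fct_inorder(JANUARY)) %>%
  # optional: sort by year, then by D-group, like the gather() output
  arrange(year, JANUARY)
```

The resulting data frame should be usable in the ggplot() call above in place of the gather() version.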
Upvotes: 1
Reputation: 467
Yes, it is doable. But first we need to get your data into a truly tabular (long) format, as if you were exporting it to SQL.
So, this is your data:
January = c("D-150","D-90","D-60")
x2016 = c(0.24 , 0.5, 0.63)
x2017 = c(0.32 , 0.45, 0.6)
x2018 = c(0.2 , 0.4 , 0.61)
df1 <- data.frame(January,x2016,x2017,x2018)
To get it into a shape that can be plotted, we have to merge your year columns into two columns (one holding the year, one holding the value), like so:
library(tidyr)
nuevoDf1<-gather(data = df1, losAnhos,valores,-January)
The result will look like this (first rows shown):
January losAnhos valores
1 D-150 x2016 0.24
2 D-90 x2016 0.50
3 D-60 x2016 0.63
4 D-150 x2017 0.32
5 D-90 x2017 0.45
Finally, using ggplot2, you can start off your graph with:
ggplot(nuevoDf1, aes(losAnhos, valores)) +
  facet_wrap(~January) +
  # geom_col() draws the bar heights directly from the values
  geom_col(na.rm = TRUE)
The result will be something like the one in the picture. I'm not a big fan of colors, but ggplot2 allows customizations after the plot has been constructed. Hope that sets you on the right path; what remains is only the cosmetic styling of the graph.
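To illustrate what customizing after construction can look like, here is a minimal sketch that stores a basic faceted bar plot in a variable and then adds styling layers to it. The object names p and p_styled, the colors, and the title are illustrative choices of mine, not part of the answer above.

```r
library(ggplot2)
library(tidyr)

df1 <- data.frame(January = c("D-150", "D-90", "D-60"),
                  x2016 = c(0.24, 0.5, 0.63),
                  x2017 = c(0.32, 0.45, 0.6),
                  x2018 = c(0.2, 0.4, 0.61))
nuevoDf1 <- gather(data = df1, losAnhos, valores, -January)

# basic plot, stored so it can be extended later
p <- ggplot(nuevoDf1, aes(losAnhos, valores)) +
  facet_wrap(~January) +
  geom_col()

# styling added on top of the existing plot object
p_styled <- p +
  aes(fill = January) +                               # one color per D-group
  scale_fill_manual(values = c("darkblue", "darkgreen", "burlywood2")) +
  scale_y_continuous(labels = scales::percent) +      # 0.24 -> 24%
  labs(x = NULL, y = NULL, title = "Month of JANUARY") +
  theme_minimal() +
  theme(legend.position = "none")

p_styled
```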
Upvotes: 2
Reputation: 2399
Yes, you can. I think your year labels aren't correct, though. Check my plot:
Here's the code that generates the plot:
library(tidyverse)
df1 %>%
gather(year, value, X2016:X2019) %>%
mutate(JANUARY = JANUARY %>% fct_rev() %>% fct_relevel('D-150')) %>%
group_by(JANUARY) %>%
mutate(y_pos = min(value) / 2) %>%
ggplot(aes(
x = JANUARY,
y = value,
fill = JANUARY,
group = year
)) +
geom_col(
position = position_dodge(.65),
width = .5
) +
geom_text(aes(
y = value + max(value) * .03,
label = round(value * 100) %>% str_c('%')
),
position = position_dodge(.65)
) +
geom_text(aes(
y = y_pos,
label = str_remove(year, 'X')
),
color = 'white',
angle = 90,
fontface = 'bold',
position = position_dodge(.65)
) +
scale_y_continuous(
breaks = seq(0, .9, .1),
labels = function(x) round(x * 100) %>% str_c('%')
) +
scale_fill_manual(values = c(
rgb(47, 85, 151, maxColorValue = 255),
rgb(84, 130, 53, maxColorValue = 255),
rgb(244, 177, 131, maxColorValue = 255),
rgb(112, 48, 160, maxColorValue = 255)
)) +
theme(
plot.title = element_text(hjust = .5),
panel.background = element_blank(),
panel.grid.major.y = element_line(color = rgb(.9, .9, .9)),
axis.ticks = element_blank(),
legend.position = 'none'
) +
xlab('') +
ylab('') +
ggtitle('Month of JANUARY')
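The y_pos column in the code above is what keeps the white year labels on one line within each group: every bar in a group gets the same label height, namely half of the shortest bar in that group. A small sketch of just that calculation, using summarise() instead of mutate() so there is one row per group (the name label_pos is only for illustration):

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(JANUARY = c("D-150", "D-90", "D-60", "D-30"),
                  X2016 = c(0.24, 0.5, 0.63, 0.76),
                  X2017 = c(0.32, 0.45, 0.6, 0.79),
                  X2018 = c(0.2, 0.4, 0.61, 0.82),
                  X2019 = c(0.21, 0.35, 0.63, 0.85),
                  stringsAsFactors = FALSE)

label_pos <- df1 %>%
  gather(year, value, X2016:X2019) %>%
  group_by(JANUARY) %>%
  # half the height of the shortest bar in each D-group
  summarise(y_pos = min(value) / 2)

label_pos
```

For this data the shortest bars are 0.20, 0.35, 0.60 and 0.76 for D-150, D-90, D-60 and D-30, so the labels sit at 0.10, 0.175, 0.30 and 0.38 respectively.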
Upvotes: 5
Reputation: 14360
With a little more data processing I think you can achieve what you want. We start by melting the data into long format, which is what ggplot requires for this type of plot. Then we create a separate labels dataset that contains the y-value (appears to be half the min within each "D" group):
library(reshape2)  # provides melt()
library(ggplot2)
df_m <- melt(df, id.vars = "JANUARY")
df_m$above_text <- scales::percent(df_m$value)
labels <- df_m
labels$value <- ave(labels$value, labels$JANUARY, FUN = function(x) min(x/2))
labels$variable <- sub("X", "", labels$variable)
pos_d <- position_dodge(width = 0.7)
ggplot(df_m, aes(x = JANUARY, y = value, group = variable, fill = JANUARY)) +
geom_col(width = 0.6, position = pos_d) +
geom_text(aes(label = above_text), position = pos_d, size = 2, hjust = 0.5, vjust = -1) +
geom_text(data = labels, aes(x = JANUARY, y = value, group = variable, label = variable), angle = 90, position = pos_d, hjust = 0.5)
Note you can play around with the % label size. What looks good depends on the actual dimensions of your image file. What looked OK for me was around 2.75 but that looked crowded copying as an image here.
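Because the text size is absolute while the bars scale with the canvas, one way to make the label size predictable is to fix the output dimensions when saving. A minimal sketch with ggsave(); the demo data, file name, and the 8 x 4.5 inch size are arbitrary assumptions for illustration:

```r
library(ggplot2)

# tiny stand-in data set, only to have something to draw
df_demo <- data.frame(group = c("D-150", "D-90"),
                      value = c(0.24, 0.5))

p <- ggplot(df_demo, aes(group, value)) +
  geom_col(width = 0.6) +
  geom_text(aes(label = scales::percent(value)), size = 2.75, vjust = -1)

# write to a temporary file with explicit dimensions and resolution
out_file <- file.path(tempdir(), "january_bars.png")
ggsave(out_file, plot = p, width = 8, height = 4.5, units = "in", dpi = 300)
```

With the dimensions pinned like this, a size value that looks right once should keep looking right on every re-render.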
Data:
df <- data.frame(JANUARY = c("D-150", "D-90", "D-60", "D-30"),
X2016 = c(0.24, 0.5, 0.63, 0.76),
X2017 = c(0.32, 0.45, 0.6, 0.79),
X2018 = c(0.2, 0.4, 0.61, 0.82),
X2019 = c(0.21, 0.35, 0.63, 0.85), stringsAsFactors = FALSE)
Upvotes: 3
Reputation: 27732
my approach
sample data
library( data.table )
dt <- fread('year "D-150" "D-90" "D-60" "D-30"
2016 0.24 0.5 0.63 0.76
2017 0.32 0.45 0.6 0.79
2018 0.2 0.4 0.61 0.82
2019 0.21 0.35 0.63 0.85', header = TRUE)
code
#first, melt
dt.melt <- melt( dt, id.vars = "year", variable.name = "Dvalue", value.name = "value" )
#create values (=positions in the chart) for the year-text within the bars.
dt.melt[, yearTextPos := min( value / 2 ), by = "Dvalue"]
#then build chart
library( ggplot2 )
library( scales)
ggplot( dt.melt, aes( x = Dvalue, y = value, group = year, fill = Dvalue ) ) +
#build the bars, dodged position
geom_col( width = 0.6, position = position_dodge(width = 0.75) ) +
#set up the y-scale
scale_y_continuous( limits = c(0,1), breaks = seq(0,1,0.1),
labels = scales::percent, expand = c(0,0) ) +
#insert year-text in bars, at the previously calculated positions
geom_text( aes( x = Dvalue, y = yearTextPos, group = year, label = year ),
color = "white", position = position_dodge( width = 0.75 ),
hjust = 0.5, angle = 90, size = 5 ) +
#write value on top as percentage
geom_text( aes( x = Dvalue, y = value + 0.01, group = year,
label = paste0( round( value * 100), "%" ) ),
color = "black", position = position_dodge( width = 0.75 ),
hjust = 0.5, angle = 0, size = 3 )
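One detail the code above relies on: melt() on a data.table returns the variable column as a factor by default, with levels in the original column order, so the groups are drawn as D-150, D-90, D-60, D-30 without any releveling. A small sketch to check this and the grouped label positions; here the table is built with data.table() instead of fread(), which is only a convenience on my part:

```r
library(data.table)

dt <- data.table(year    = 2016:2019,
                 `D-150` = c(0.24, 0.32, 0.2, 0.21),
                 `D-90`  = c(0.5, 0.45, 0.4, 0.35),
                 `D-60`  = c(0.63, 0.6, 0.61, 0.63),
                 `D-30`  = c(0.76, 0.79, 0.82, 0.85))

dt.melt <- melt(dt, id.vars = "year",
                variable.name = "Dvalue", value.name = "value")

# factor levels follow the column order of dt, not alphabetical order
levels(dt.melt$Dvalue)

# one label height per D-group: half of the shortest bar
dt.melt[, yearTextPos := min(value / 2), by = "Dvalue"]
unique(dt.melt[, .(Dvalue, yearTextPos)])
```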
Upvotes: 2