Shakil Ahmed Shaon

Reputation: 119

How to add text to a stacked bar plot using the ggplot2 package in R?

I am analyzing a 5-point Likert scale questionnaire and trying to visualize it as a stacked bar plot using ggplot2 in R.

The dataset can be found at this link: https://gofile.io/d/fKVZuL

My dataset is in .sav (SPSS) format.

So I am using the following code to read the data:

require("foreign")
d = read.spss(file.choose(), to.data.frame=TRUE)
attach(d)

Now, to plot this 5-point Likert scale as a stacked bar plot, I have used the tidyverse package, which includes ggplot2:

require("tidyverse")
d %>% select(F1:F6) %>% na.omit %>% nrow
d %>% select(F1:F6) %>% na.omit -> f_items
f_items %>% gather(key = items, value = answer) %>% mutate(answer = factor(answer),items = factor(items)) -> data2

To rearrange the keys of the legend, I have used the following code:

data2$answer = factor(data2$answer, levels = c("Strongly Agree", "Agree", "Neutral",
                                           "Disagree", "Strongly Disagree"))

Then I created the stacked bar plot using the following code:

ggplot(data2, aes(x = items)) +
geom_bar(aes(fill = answer), position = "fill") +
coord_flip() +
scale_x_discrete(limits = rev(levels(data2$items)))+
scale_y_continuous(labels = scales::percent)+
scale_fill_brewer(palette="RdYlBu")-> p2
p2

This code produces this figure: (image not shown)

Now I want to add the percentage of each response to every question, as in this figure (image not shown), but I have not been able to work out the code.

How can I add the response percentages for each question, as in this figure? Any help would be greatly appreciated.

-Shakil

Upvotes: 1

Views: 1130

Answers (1)

Duck

Reputation: 39595

You can try the following code. It is better to process the data first so that you have the labels and proportions, as @Axeman suggested earlier:

library(foreign)
library(tidyverse)
#Data
d = read.spss(file.choose(), to.data.frame=TRUE)
attach(d)
#Process
d %>% select(F1:F6) %>% na.omit %>% nrow
d %>% select(F1:F6) %>% na.omit -> f_items
f_items %>% gather(key = items, value = answer) %>% mutate(answer = factor(answer),items = factor(items)) -> data2
#Assign factor
data2$answer = factor(data2$answer, levels = c("Strongly Agree", "Agree", "Neutral",
                                               "Disagree", "Strongly Disagree"))
#Some code for proportions and labels
data2 %>% group_by(items,answer) %>% summarise(freq=n()) %>% ungroup() %>%
  group_by(items) %>% mutate(total = sum(freq),prop = freq/total) -> labdf
labdf %>% ungroup() -> labdf
#Create label
labdf$Label <- ifelse(labdf$prop<0.06,NA,paste0(100*round(labdf$prop,3),'%'))
#Plot
ggplot(labdf, aes(x = items, y = prop,group=answer))+
  geom_bar(stat='identity',aes(fill = answer), position = 'fill')+
  geom_text(aes(label = Label),position = position_fill(vjust = 0.5),size=3)+
  coord_flip() +
  scale_x_discrete(limits = rev(levels(data2$items)))+
  scale_y_continuous(labels = scales::percent)+
  scale_fill_brewer(palette="RdYlBu")-> p2
p2

Output:

(image not shown)

Some proportions will be too small for their segment to hold a label, so the labels could overlap their neighbours. That is why the label is set to NA when prop is below 0.06. You can change that threshold in labdf$Label <- ifelse(labdf$prop<0.06,NA,paste0(100*round(labdf$prop,3),'%')) to decide which labels are kept.
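If you want to experiment with the threshold without the .sav file, here is a minimal sketch of the same idea on simulated data. The simulated answers, the probabilities used to draw them, and the name min_prop are assumptions made for illustration, not part of the original dataset. It also swaps in count() and pivot_longer() for the summarise() and gather() steps, and sprintf() for the paste0()/round() label, which should give equivalent results.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Simulated stand-in for the SPSS data (assumption: six items F1..F6,
# all answered on the same 5-point scale)
lik <- c("Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree")
set.seed(1)
f_items <- as.data.frame(
  replicate(6, sample(lik, 200, replace = TRUE,
                      prob = c(0.35, 0.30, 0.20, 0.12, 0.03)))
)
names(f_items) <- paste0("F", 1:6)

# Hypothetical name for the cut-off: segments smaller than this get no label
min_prop <- 0.06

# Long format, then frequency and proportion of each answer within each item
labdf <- f_items %>%
  pivot_longer(everything(), names_to = "items", values_to = "answer") %>%
  mutate(answer = factor(answer, levels = lik)) %>%
  count(items, answer, name = "freq") %>%
  group_by(items) %>%
  mutate(prop = freq / sum(freq)) %>%
  ungroup() %>%
  mutate(Label = ifelse(prop < min_prop, NA_character_,
                        sprintf("%.1f%%", 100 * prop)))

# Same plot as above; na.rm = TRUE silences the warning about the NA labels
p <- ggplot(labdf, aes(x = items, y = prop, fill = answer)) +
  geom_col(position = "fill") +
  geom_text(aes(label = Label), position = position_fill(vjust = 0.5),
            size = 3, na.rm = TRUE) +
  coord_flip() +
  scale_x_discrete(limits = rev(sort(unique(labdf$items)))) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_brewer(palette = "RdYlBu")
p
```

Raising min_prop hides more of the small segments' labels, and lowering it shows more of them at the risk of overlap.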

Upvotes: 2
