becbot

Reputation: 173

Stacked Barplot with Frequency Counts ggplot2

I have a dataset with several conditions, and I want to create a stacked bar graph showing the frequency of errors in each condition (that is, the number of cases in each condition where 1 error occurred, 2 errors occurred, 3 errors occurred, and so on).

In theory, I understand the principle of creating bar graphs with ggplot2. However, the problem I am having is that the 'frequency' count is not an actual variable in the data frame (it requires counting the number of cases). I'm not sure how to add it to the ggplot2 framework (potentially using the 'stat' function, but I'm not certain how this works).

I checked out the following similar questions:

How to barplot frequencies with ggplot2?

R stacked % frequency histogram with percentage of aggregated data based on

Display frequency instead of count with geom_bar() in ggplot

How to label stacked histogram in ggplot

But none of them really provides the answer I'm looking for (i.e., how to count the number of cases for each 'error' and include that in the ggplot2 code).

Below are some of my attempts with example data:

library(tidyverse)
library(reshape2)  # provides melt(), which tidyverse does not load

condition <- c("condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3")
number_of_errors <- c(1,2,3,3,2,1,4,4,5,4,5,1,2,2,3)

df <- data.frame(condition, number_of_errors)
df

# melt() creates a data frame with 3 columns, 'condition', 'variable' and 'value',
# where 'variable' just says 'number_of_errors' for each row
df_melt <- melt(df)


# Attempt 1 - (Error: stat_bin() can only have an x or y aesthetic.)
ggplot(df_melt, aes(x=condition, y = variable, fill=value)) + 
  geom_bar(stat="bin", position="stack") +
  xlab("Condition") + 
  ylab("Frequency of Errors")


# Attempt 2 (produces a graph, but not a stacked one, just the total number of cases in each condition)
ggplot(df_melt, aes(x = condition, fill = value, label = value)) +
  geom_bar(col="black") +
  stat_count(position="stack")


# Attempt 3 (also produces a graph, but again not a stacked one - I think it is the sum of the number of errors?)
ggplot(df_melt,aes(factor(condition),y=as.numeric(value))) + 
  geom_bar(stat = "identity", position = "stack")

I am certain I must be missing something obvious about how to create values for the counts, but I'm not sure what. Any guidance is appreciated :)

Upvotes: 1

Views: 10857

Answers (2)

Chuck P

Reputation: 3923

I think the key for you might be to convert number_of_errors to a factor and use geom_bar(stat="count"). You may also benefit from this tutorial.

library(ggplot2)
df$number_of_errors <- factor(df$number_of_errors)

ggplot(df, aes(x=condition, fill = number_of_errors)) +
  geom_bar(stat="count")
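
If it helps to see the counts as an actual column, the same plot can be built by counting first and plotting second. The following is a minimal sketch, assuming dplyr is available; the names df_counts and freq are chosen here for illustration and do not come from the question.

```r
library(dplyr)
library(ggplot2)

# Same example data as in the question
condition <- rep(c("condition 1", "condition 2", "condition 3"), times = 5)
number_of_errors <- c(1, 2, 3, 3, 2, 1, 4, 4, 5, 4, 5, 1, 2, 2, 3)
df <- data.frame(condition, number_of_errors)

# Count the cases for each condition / number_of_errors combination.
# 'freq' is an illustrative column name, not something ggplot2 requires.
df_counts <- df %>%
  count(condition, number_of_errors, name = "freq")

# geom_col() plots the precomputed counts as given; segments stack by default
p <- ggplot(df_counts,
            aes(x = condition, y = freq, fill = factor(number_of_errors))) +
  geom_col() +
  labs(x = "Condition", y = "Frequency", fill = "Number of errors")
print(p)
```

This should give the same bars as stat="count", which does the counting internally. Having df_counts as a data frame can make it easier to check the numbers or to add labels later.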

Upvotes: 3

Duck

Reputation: 39595

Maybe you are looking for this style of plot. You need to group by condition and then assign an index to each case within its group, so that each case gets its own bar segment. Here is the code:

library(tidyverse)
#Data
condition <- c("condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3")
number_of_errors <- c(1,2,3,3,2,1,4,4,5,4,5,1,2,2,3)
df <- data.frame(condition, number_of_errors)
#Code
df %>% group_by(condition) %>% mutate(Number=factor(1:n())) %>%
  ggplot(aes(x=condition,y=number_of_errors,fill=Number,group=Number))+
  geom_bar(stat = 'identity')+
  geom_text(aes(label=number_of_errors),position = position_stack(0.5))+
  theme(legend.position = 'none')
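
To see what the group_by() and mutate() steps contribute, the intermediate data frame can be inspected before plotting. This is a small sketch on the same example data; df_indexed and totals are illustrative names. Note that with stat = 'identity' each segment's height is that case's number_of_errors, so a bar's total height is the sum of errors in the condition rather than a count of cases.

```r
library(dplyr)

# Same example data as above
condition <- rep(c("condition 1", "condition 2", "condition 3"), times = 5)
number_of_errors <- c(1, 2, 3, 3, 2, 1, 4, 4, 5, 4, 5, 1, 2, 2, 3)
df <- data.frame(condition, number_of_errors)

# Number indexes the cases within each condition (1 to 5 here),
# which is what gives every case its own stacked segment
df_indexed <- df %>%
  group_by(condition) %>%
  mutate(Number = factor(1:n())) %>%
  ungroup()
print(df_indexed)

# Total bar height per condition = sum of number_of_errors
totals <- df %>%
  group_by(condition) %>%
  summarise(total_errors = sum(number_of_errors))
print(totals)
```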


Upvotes: 3
