Manuel

Reputation: 87

Stacked bar plot based on 4 variables with ggplot2

I have a data frame like this:

nthreads ab_1 ab_2 ab_3 ab_4 ...
1        0    0    0    0    ...
2        1    0    12   1    ...
4        2    1    22   1    ...
8        10   2    103  8    ...

Each ab_X column counts a different cause that triggers an abort in my code. I want to summarize all abort causes in a bar plot of nthreads vs. aborts, with the different ab_X values stacked in each bar.

I can do

ggplot(data, aes(x=factor(nthreads), y=ab_1+ab_2+ab_3+ab_4)) +
  geom_bar(stat="identity")

But that only gives the total number of aborts. I know there is a fill aesthetic, but I cannot make it work with continuous variables.

Upvotes: 0

Views: 1085

Answers (2)

neilfws

Reputation: 33802

It gives the total number of aborts because you are adding them together :)

You need to get your data from wide to long format first, i.e. create one column for the abort causes and a second for their values. You can use tidyr::gather for that. I also find geom_col more convenient than geom_bar, since it plots the values as given without needing stat = "identity":

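library(tidyr)
library(ggplot2)
data %>%
  # wide to long: one row per (nthreads, abort cause) pair
  gather(abort, value, -nthreads) %>%
  ggplot(aes(factor(nthreads), value)) +
    # fill by abort cause so each cause is a segment of the stacked bar
    geom_col(aes(fill = abort)) +
    labs(x = "nthreads", y = "count")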

Note that the range of values makes some of the bars rather hard to see, so you might want to think about scales and maybe even facets.
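One way to act on the scales and facets suggestion, as a rough sketch only: the data frame below is rebuilt from the four columns shown in the question (the elided columns are left out), and facet_wrap with scales = "free_y" gives each abort cause its own panel and y axis. pivot_longer is the newer tidyr function that supersedes gather; the column name count and the object names long and p are illustrative choices, not anything from the question.

```r
library(tidyr)
library(ggplot2)

# Sample data copied from the question; only the four columns shown there
data <- data.frame(
  nthreads = c(1, 2, 4, 8),
  ab_1 = c(0, 1, 2, 10),
  ab_2 = c(0, 0, 1, 2),
  ab_3 = c(0, 12, 22, 103),
  ab_4 = c(0, 1, 1, 8)
)

# Wide to long: one row per (nthreads, abort cause) pair
long <- pivot_longer(data, cols = -nthreads,
                     names_to = "abort", values_to = "count")

# One panel per abort cause, each with its own y axis,
# so the small counts are not flattened by ab_3
p <- ggplot(long, aes(factor(nthreads), count, fill = abort)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~ abort, scales = "free_y") +
  labs(x = "nthreads", y = "count")
print(p)
```

The trade-off is that faceting gives up the single stacked bar per nthreads value, so it works best as a companion to the stacked plot rather than a replacement.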

Upvotes: 1

amatsuo_net

Reputation: 2448

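You have to melt the data frame first, so that the abort causes end up in a single column that can be mapped to fill:

library(data.table)
library(ggplot2)
# data.table's melt() expects a data.table, so convert the data frame first
dt_melt <- melt(as.data.table(data), id.vars = 'nthreads')
# factor(nthreads) gives evenly spaced bars, as in the question
ggplot(dt_melt, aes(x = factor(nthreads), y = value, fill = variable)) + 
    geom_bar(stat = 'identity')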
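To see what melt produces, here is a small check on the sample rows from the question. It is a sketch under the assumption that the data looks like the four columns shown there (the elided columns are omitted). By default the stacked columns are named variable and value, which is why the plot maps fill = variable.

```r
library(data.table)

# Sample data copied from the question; elided columns are omitted
data <- data.table(
  nthreads = c(1, 2, 4, 8),
  ab_1 = c(0, 1, 2, 10),
  ab_2 = c(0, 0, 1, 2),
  ab_3 = c(0, 12, 22, 103),
  ab_4 = c(0, 1, 1, 8)
)

# Long format: nthreads is kept as the id column, the ab_* columns are stacked
dt_melt <- melt(data, id.vars = "nthreads")

print(names(dt_melt))  # "nthreads" "variable" "value"
print(nrow(dt_melt))   # 16 (4 thread counts x 4 abort causes)
```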

Upvotes: 2
