Nando

Reputation: 21

Stacked bar plot with percentages in separate columns

I am attempting to draw a stacked bar plot with the following data, using either ggplot2 or the barplot function in R. I have failed with both.

str(ISCE_LENGUAJE5_APE_DEC)
'data.frame':   50 obs. of  5 variables:
$ Nombre             : Factor w/ 49 levels "C.E. DE BORAUDO",..: 6 5 25 21 16 7 27 45 24 38 ...
$ v2014_5L_porNivInsu: int  100 93 73 67 67 65 63 60 59 54 ...
$ v2014_5L_porNivMini: int  0 7 22 26 32 32 37 26 34 35 ...
$ v2014_5L_porNivSati: int  0 0 4 6 2 3 0 12 6 10 ...
$ v2014_5L_porNivAvan: int  0 0 1 2 0 0 0 2 1 2 ...

The integers are percentage values: the sum of the v2014... columns for each observation is 100.

I have attempted to use ggplot2, but I only manage to plot one of the variables, not the stacked bar with all four.

ggplot(ISCE_LENGUAJE5_APE_DEC, aes(x=Nombre, y= v2014_5L_porNivInsu)) + geom_bar(stat="identity")

I can't figure out how to pass the values for all four columns to the y parameter.

If I only pass x, I get an error:

ggplot(ISCE_LENGUAJE5_APE_DEC, aes(x=Nombre)) + geom_bar(stat="identity")
Error in exists(name, envir = env, mode = mode) : 
argument "env" is missing, with no default

I found this answer, but don't understand the data transformations used. Thank you for any help provided.

Upvotes: 0

Views: 707

Answers (1)

essicolo

Reputation: 818

ggplot2 works best with data in "long" format, with one row per bar segment. The melt function from the reshape2 package converts your table to that format.

Because you did not provide a reproducible example, I generated some data.

v2014 <- data.frame(v2014_5L_porNivInsu = sample(1:100, 50, replace = TRUE),
                    v2014_5L_porNivMini = sample(1:50, 50, replace = TRUE),
                    v2014_5L_porNivSati = sample(0:10, 50, replace = TRUE),
                    v2014_5L_porNivAvan = sample(0:2, 50, replace = TRUE))

# Rescale each row so that its four values sum to 100
v2014_prop <- t(apply(v2014, 1, function(x) {x / sum(x) * 100}))

ISCE_LENGUAJE5_APE_DEC <- data.frame(Nombre = factor(sample(1:100, 50)),
                                     v2014_prop)

You first express your table in long format using melt.

library(reshape2)
gg <- melt(ISCE_LENGUAJE5_APE_DEC, id.vars = "Nombre")

See what your new table, gg, looks like.

str(gg)
head(gg)
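
To make the reshaping concrete, here is a minimal sketch on a toy table. The names toy, long, Insu and Mini and the values are made up for illustration and are not from your data. Each value column becomes a block of rows, with the old column name stored in variable and the number stored in value.

```r
library(reshape2)

# Hypothetical wide table: one row per school, one column per level
toy <- data.frame(Nombre = c("A", "B"),
                  Insu = c(60, 50),
                  Mini = c(40, 50))

# Long format: one row per (school, level) pair
long <- melt(toy, id.vars = "Nombre")
print(long)
#   Nombre variable value
# 1      A     Insu    60
# 2      B     Insu    50
# 3      A     Mini    40
# 4      B     Mini    50
```

The long table has nrow(toy) times the number of value columns as its row count, which is a quick sanity check after melting.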

In your ggplot call, use the data.frame gg. The x-axis is Nombre and the y-axis is value, i.e. the percentages. Each bar is segmented by fill colour according to the variable column, which, thanks to melt, holds the v2014_... names as factor levels instead of as column headers.

library(ggplot2)
ggplot(gg, aes(x = Nombre, y = value, fill = variable)) + 
  geom_bar(stat = "identity")
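
Since the question also mentions the base barplot function, here is a rough sketch of that route on toy data. The data frame toy and its short column names are hypothetical; the three rows reuse the first three observations shown in the question's str output. barplot wants a matrix whose columns are the bars and whose rows are the stacked segments, so the wide table is transposed rather than melted.

```r
# Hypothetical stand-in for the question's table (short column names)
toy <- data.frame(Nombre = c("A", "B", "C"),
                  Insu = c(100, 93, 73),
                  Mini = c(0, 7, 22),
                  Sati = c(0, 0, 4),
                  Avan = c(0, 0, 1))

# Drop the name column and transpose: levels in rows, schools in columns
m <- t(as.matrix(toy[, -1]))
colnames(m) <- toy$Nombre

# With the default beside = FALSE, the rows of m are stacked within each bar
barplot(m, legend.text = rownames(m), ylab = "Percent")
```

With 50 schools the axis labels will likely overlap; passing las = 2 to barplot rotates them.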

Upvotes: 1
