Duck

Reputation: 39585

Adjust labels in ggplot2 and add another label at the top of each bar

Hi everybody. I am working with a data frame in R to build a graph. The graph itself works, but I have some problems with the labels. My data frame DF looks like this (the dput() version is at the end of the question):

   Mes Estado Numero Label
1    2      X      7 22 (1.19%)
2    2      A     13 22 (1.19%)
3    2      Z      2 22 (1.19%)
4    3      X     19 30 (1.62%)
5    3      A     10 30 (1.62%)
6    3      Z      1 30 (1.62%)
7    4      X     19 31 (1.68%)
8    4      A     11 31 (1.68%)
9    4      Z      1 31 (1.68%)
10   5      X     17 28 (1.52%)
11   5      A      7 28 (1.52%)
12   5      Z      4 28 (1.52%)

It has four variables: Mes, Estado, Numero and Label. I want to show the distribution of Estado within each Mes according to the number of cases (Numero), so I built this plot with the following code:

library(ggplot2)
library(scales)  # provides comma() for the y-axis labels

AAA <- ggplot(DF, aes(x = Mes, y = Numero, fill = Estado)) +
  geom_bar(stat = "identity") + scale_y_continuous(labels = comma) +
  geom_text(aes(label = Numero), fontface = "bold", size = 6)
print(AAA)


As you can see, the distribution of Estado for each value of Mes according to Numero works fine, but the problem is with the labels. I would like each label to sit in the middle of its own colored segment. For example, in the first bar, 2 should be in the blue area, 7 in the green area and 13 in the pink area, but the labels are not placed in that order. Moreover, DF has a variable named Label whose values I am trying to add above each bar. For example, for Mes=2 Label is 22 (1.19%): 22 is the sum of all values in the bar, and the value in parentheses is that sum as a share of 1848 (22/1848). I would like to add these values at the top of each bar, but when I tried another geom_text() with unique(PPP$Label) I got an error. The dput version of DF is:

DF<-structure(list(Mes = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 
3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 
9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 
13L, 13L, 14L, 14L, 14L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 17L, 
18L, 18L, 19L, 20L), .Label = c("2", "3", "4", "5", "6", "7", 
"8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", 
"19", "20", "21"), class = "factor"), Estado = structure(c(2L, 
1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 
2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 
1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 1L, 3L, 
1L, 3L, 1L, 3L), .Label = c("A", "X", "Z"), class = "factor"), 
    Numero = c(7L, 13L, 2L, 19L, 10L, 1L, 19L, 11L, 1L, 17L, 
    7L, 4L, 19L, 8L, 7L, 11L, 13L, 15L, 8L, 3L, 13L, 13L, 8L, 
    6L, 14L, 4L, 11L, 14L, 5L, 3L, 4L, 3L, 5L, 12L, 6L, 2L, 9L, 
    4L, 2L, 6L, 5L, 1L, 5L, 2L, 1L, 2L, 3L, 5L, 2L, 3L, 2L, 1L, 
    1L), Label = c("22 (1.19%)", "22 (1.19%)", "22 (1.19%)", 
    "30 (1.62%)", "30 (1.62%)", "30 (1.62%)", "31 (1.68%)", "31 (1.68%)", 
    "31 (1.68%)", "28 (1.52%)", "28 (1.52%)", "28 (1.52%)", "34 (1.84%)", 
    "34 (1.84%)", "34 (1.84%)", "24 (1.3%)", "24 (1.3%)", "26 (1.41%)", 
    "26 (1.41%)", "26 (1.41%)", "34 (1.84%)", "34 (1.84%)", "34 (1.84%)", 
    "24 (1.3%)", "24 (1.3%)", "24 (1.3%)", "30 (1.62%)", "30 (1.62%)", 
    "30 (1.62%)", "10 (0.54%)", "10 (0.54%)", "10 (0.54%)", "23 (1.24%)", 
    "23 (1.24%)", "23 (1.24%)", "15 (0.81%)", "15 (0.81%)", "15 (0.81%)", 
    "13 (0.7%)", "13 (0.7%)", "13 (0.7%)", "8 (0.43%)", "8 (0.43%)", 
    "8 (0.43%)", "6 (0.32%)", "6 (0.32%)", "6 (0.32%)", "7 (0.38%)", 
    "7 (0.38%)", "5 (0.27%)", "5 (0.27%)", "1 (0.05%)", "1 (0.05%)"
    )), .Names = c("Mes", "Estado", "Numero", "Label"), row.names = c(NA, 
-53L), class = "data.frame")

Many thanks for your help.

Upvotes: 1

Views: 223

Answers (1)

Sven Hohenstein

Reputation: 81683

First, we calculate the positions of the midpoints (NumeroPos) and the heights of the stacked bars (NumeroSum).

DF <- transform(DF, NumeroPos = ave(Numero, Mes, FUN = cumsum) - Numero / 2,
                NumeroSum = ave(Numero, Mes, FUN = sum))
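
To see what these two columns contain, here is a minimal sketch on a single bar, using the three Numero values for Mes = 2 from the question. The data frame name `bar` and the intermediate `top_edge` are illustrative only and do not appear in the original code.

```r
# Illustrative one-bar example: Mes = 2 with segments 7, 13 and 2.
bar <- data.frame(Mes = factor(c(2, 2, 2)), Numero = c(7L, 13L, 2L))

# Running total within each bar: the upper edge of every segment,
# assuming segments are stacked in row order.
top_edge <- ave(bar$Numero, bar$Mes, FUN = cumsum)

# Subtracting half the segment height moves from the upper edge to the middle.
bar$NumeroPos <- top_edge - bar$Numero / 2

# The bar total, repeated on every row that belongs to the bar.
bar$NumeroSum <- ave(bar$Numero, bar$Mes, FUN = sum)

print(bar)
```

The midpoints only line up with the colored segments if the bars are stacked in the same order as the rows of the data. The stacking order changed in ggplot2 2.2.0, so it is worth checking the result against the installed version.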

Now, the new variables can be used for creating the labels. Note that we use a subset of the data frame for the labels on top of the bars since we need exactly one label for each bar.

library(ggplot2)
ggplot(DF, aes(x = Mes, y = Numero, fill = Estado)) +
  geom_bar(stat = "identity") + 
  geom_text(aes(label = Numero, y = NumeroPos), fontface = "bold", size = 6) +
  geom_text(data = DF[!duplicated(DF$Mes), ], 
            aes(y = NumeroSum, label = Label), vjust = -.5, size = 4)
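
As a possible alternative, assuming ggplot2 2.2.0 or later, `position_stack(vjust = 0.5)` lets ggplot2 compute the segment midpoints itself, so they follow whatever stacking order the bars use. The sketch below uses a small stand-in for DF (the first two months from the question); `DF_demo` and `tops` are illustrative names, not part of the original answer.

```r
library(ggplot2)

# Stand-in for DF: the first two months from the question.
DF_demo <- data.frame(
  Mes    = factor(c(2, 2, 2, 3, 3, 3)),
  Estado = factor(c("X", "A", "Z", "X", "A", "Z")),
  Numero = c(7L, 13L, 2L, 19L, 10L, 1L),
  Label  = rep(c("22 (1.19%)", "30 (1.62%)"), each = 3)
)

# One row per bar: the bar total (in Numero) and the label to print above it.
tops <- aggregate(Numero ~ Mes + Label, data = DF_demo, FUN = sum)

p <- ggplot(DF_demo, aes(x = Mes, y = Numero, fill = Estado)) +
  geom_col() +
  # vjust = 0.5 centers each label inside its own stacked segment.
  geom_text(aes(label = Numero),
            position = position_stack(vjust = 0.5),
            fontface = "bold", size = 6) +
  # The top labels use their own data, so the global fill mapping is not inherited.
  geom_text(data = tops, aes(x = Mes, y = Numero, label = Label),
            inherit.aes = FALSE, vjust = -0.5, size = 4)

print(p)
```

With this approach the helper column NumeroPos is not needed; only the per-bar totals are computed by hand.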


Upvotes: 3
