Matt

Reputation: 73

Ordering stacked bars in descending order

I have successfully made a stacked barplot in R where the percentages add up to 100% for several different categories. The dataframe looks like this:

 sujeito epentese vozeamento teste posicao palavra tipo  ortografia cseguinte
   <chr>   <chr>    <chr>      <chr> <chr>   <chr>   <chr> <chr>      <chr>    
 1 a       1        1          P     L       alpes   ps    ces        d_v      
 2 a       0        1          P     L       crepes  ps    ces        d_v      
 3 a       0        0          P     L       chopes  ps    ces        d_v      
 4 a       1        0          P     L       jipes   ps    ces        d_d      
 5 a       1        0          P     L       naipes  ps    ces        d_d      
 6 a       0        0          P     L       xaropes ps    ces        d_d      
 7 a       0        0          P     L       artes   ts    ces        d_v      
 8 a       0        0          P     L       botes   ts    ces        d_v      
 9 a       0        0          P     L       dentes  ts    ces        d_v      
10 a       0        0          P     L       potes   ts    ces        d_d      
# ... with 421 more rows

Then I used ggplot and dplyr to make a stacked barplot displaying these percentages. I used this code:

dadospb%>%
  group_by(tipo, epentese)%>%
  summarise(quantidade = n())%>%
  mutate(frequencia = quantidade/sum(quantidade))%>%
  ggplot(., aes(x = tipo, y = frequencia, fill = epentese))+
  geom_col(position = position_fill(reverse=FALSE))+
  geom_text(aes(label = if_else(epentese == 1, scales::percent(frequencia, accuracy = 1), "")), vjust = 0, nudge_y = .01) +
  scale_y_continuous(labels=scales::percent)+
  labs(title = "Epenthesis rates by cluster type on L1 Portuguese")+
  theme(plot.title = element_text(hjust = 0.5))+
  xlab("Cluster Type")+ylab("Frequency")

My goal, though, is to make it look like the graph on the right-hand side of this picture, with the columns arranged in descending order: [image]

I have tried different packages and also manipulating group_by, but still no luck. I hope this isn't too redundant. The tutorials I've come across on the web involve manipulating the Tidyverse, of which I have only elementary knowledge. Thanks in advance!

Upvotes: 1

Views: 1638

Answers (2)

Jon Spring

Reputation: 66415

I like using the forcats package for ordering categories before they get into ggplot. In this case, we could use fct_inorder after sorting the data by epentese (so 1 appears first) and then by descending frequencia. tipo then becomes a factor whose levels follow that sorted order, and ggplot will plot the bars in that order. (See how cluster 4 comes before cluster 3 in my made-up data.)

I used mtcars but renamed to have your data's names:

library(dplyr); library(forcats)
# Prep to make mtcars look like your data
mtcars %>%
  mutate(vs = as.character(vs)) %>%
  group_by(tipo = carb, epentese = vs) %>%
  summarise(quantidade = sum(wt))%>%
  mutate(frequencia = quantidade/sum(quantidade)) %>%
  ungroup() %>%


  # Arrange in the way you want and then make tipo a factor
  # I want epentese = 1 first, then descending frequencia
  # ggplot displays a factor's levels in order, so the bars follow this sort
  arrange(desc(epentese), -frequencia) %>%  
  mutate(tipo = tipo %>% as_factor %>% fct_inorder) %>%
  ...
  [Your ggplot code]
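
For completeness, here is a minimal end-to-end sketch of the same arrange-then-fct_inorder idea with the question's column names. The `toy` data frame is made up for illustration (it is not the asker's data), and the plot only keeps the layers relevant to the ordering, so treat it as a starting point and not as the exact code behind the figure.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Toy data (hypothetical), using the question's column names.
# Epenthesis rates: ps = 75%, ks = 50%, ts = 25%.
toy <- data.frame(
  tipo     = c("ps", "ps", "ps", "ps", "ts", "ts", "ts", "ts", "ks", "ks", "ks", "ks"),
  epentese = c("1",  "1",  "1",  "0",  "1",  "0",  "0",  "0",  "1",  "1",  "0",  "0"),
  stringsAsFactors = FALSE
)

plot_data <- toy %>%
  count(tipo, epentese, name = "quantidade") %>%
  group_by(tipo) %>%
  mutate(frequencia = quantidade / sum(quantidade)) %>%
  ungroup() %>%
  # epentese == "1" rows first, highest rate first ...
  arrange(desc(epentese), desc(frequencia)) %>%
  # ... so the factor levels are taken in that order of first appearance
  mutate(tipo = fct_inorder(tipo))

p <- ggplot(plot_data, aes(x = tipo, y = frequencia, fill = epentese)) +
  geom_col(position = position_fill(reverse = FALSE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Cluster Type", y = "Frequency")
# print(p) draws the plot, with bars ordered ps, ks, ts
```

The key point is that the sort has to happen before `fct_inorder`, because the level order is fixed at the moment the factor is created.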
  


Upvotes: 5

tjebo

Reputation: 23737

To help you translate the linked question and answer to your problem at hand -

``` r
library(tidyverse)
# devtools::install_github("alistaire47/read.so")
dadospb <- read.so::read_so("sujeito epentese vozeamento teste posicao palavra tipo  ortografia cseguinte
   <chr>   <chr>    <chr>      <chr> <chr>   <chr>   <chr> <chr>      <chr>    
 1 a       1        1          P     L       alpes   ps    ces        d_v      
 2 a       0        1          P     L       crepes  ps    ces        d_v      
 3 a       0        0          P     L       chopes  ps    ces        d_v      
 4 a       1        0          P     L       jipes   ps    ces        d_d      
 5 a       1        0          P     L       naipes  ps    ces        d_d      
 6 a       0        0          P     L       xaropes ps    ces        d_d      
 7 a       0        0          P     L       artes   ts    ces        d_v      
 8 a       0        0          P     L       botes   ts    ces        d_v      
 9 a       1       0          P     L       dentes  ts    ces        d_v      
10 a       0        0          P     L       potes   ts    ces        d_d ")

df1 <- 
  dadospb%>%
  group_by(tipo, epentese)%>%
  summarise(quantidade = n())%>%
  mutate(frequencia = quantidade/sum(quantidade))
#> `summarise()` has grouped output by 'tipo'. You can override using the `.groups` argument.

fac_order <- df1 %>%
  filter(epentese == 1) %>%
  # desc() puts the highest epenthesis rate first (descending bars)
  arrange(desc(frequencia)) %>%
  pull(tipo)

df1 %>%
  mutate(novotipo = factor(tipo, levels = fac_order)) %>%
  ggplot(aes(x = novotipo, y = frequencia, fill = epentese)) +
  geom_col()
```

Created on 2021-02-13 by the reprex package (v1.0.0)
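
One caveat with building the level vector from the `epentese == 1` rows only: a cluster type with no such rows never makes it into `fac_order`, so `factor(tipo, levels = fac_order)` turns it into `NA`. A possible workaround, sketched below on made-up data, is to compute the rate per `tipo` directly so that every type gets a position. The `toy` data frame and the column name `taxa` are assumptions for illustration only.

```r
library(dplyr)

# Toy data (hypothetical): "ts" has no epentese == "1" rows at all
toy <- data.frame(
  tipo     = c("ps", "ps", "ts", "ts", "ks", "ks"),
  epentese = c("1",  "0",  "0",  "0",  "1",  "1"),
  stringsAsFactors = FALSE
)

# Share of epentese == "1" per cluster type; types without any "1" get 0
fac_order <- toy %>%
  group_by(tipo) %>%
  summarise(taxa = mean(epentese == "1")) %>%
  arrange(desc(taxa)) %>%
  pull(tipo)

# Every tipo is now a known level, so no NA values are introduced
toy_ordered <- toy %>%
  mutate(novotipo = factor(tipo, levels = fac_order))
```

`novotipo` can then be used on the x axis exactly as in the code above.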

Upvotes: 0
