Laura

Reputation: 306

How to make ggplot order a stacked bar chart

I have the following R code, where I transform the data and then order it by a specific column:

df2 <- df %>% 
    group_by(V2, news) %>% 
    tally() %>% 
    complete(news, fill = list(n = 0)) %>% 
    mutate(percentage = n / sum(n) * 100)

df22 <- df2[order(df2$news, -df2$percentage),]

I want to plot the ordered data frame "df22" with ggplot:

ggplot(df22, aes(x = V2, y = percentage, fill = factor(news, labels = c("Read","Otherwise")))) +
    geom_bar(stat = "identity", position = "fill", width = .7) +
    coord_flip() +
    guides(fill = guide_legend(title = "Online News")) + 
    scale_fill_grey(start = .1, end = .6) +
    xlab("Country") +
    ylab("Share")

Unfortunately, ggplot still returns a plot that ignores this order:

enter image description here

Does anyone know what is wrong with my code? This is not the same as ordering a bar chart with a single value per bar, as in "Reorder bars in geom_bar ggplot2". I am trying to order the chart by one specific category of a factor. In particular, I want the countries with the largest share of Read news to appear first.

Here is the data:

               V2      news     n   percentage
 1  United States News Read  1583   1.845139
 2    Netherlands News Read  1536   1.790356
 3        Germany News Read  1417   1.651650
 4      Singapore News Read  1335   1.556071
 5  United States Otherwise  581    0.6772114
 6    Netherlands Otherwise  350    0.4079587
 7        Germany Otherwise  623    0.7261665     
 8      Singapore Otherwise  635    0.7401536

I used the following R code:

df2 <- df %>% 
    group_by(V2, news) %>% 
    tally() %>% 
    complete(news, fill = list(n = 114)) %>% 
    mutate(percentage = n / sum(n) * 100)

df2 <- df2[order(df2$news, -df2$percentage),]

df2 <- df2 %>% group_by(news, percentage) %>% arrange(desc(percentage)) 
df2$V2 <- factor(df2$V2, levels = unique(df2$V2)) 


ggplot(df2, aes(x = V2, y = percentage, fill = news)) + 
    geom_bar(stat = "identity", position = "stack") + 
    guides(fill = guide_legend(title = "Online News")) + 
    coord_flip() + 
    scale_x_discrete(limits = rev(levels(df2$V2)))

Everything works, except that some countries break the order and I do not understand why. Here is the picture:

enter image description here

Following the hints from the commenters, I used a single arrange() call instead of the piped version:

df4 <- arrange(df2, news, desc(percentage))

Here is the result:

enter image description here

Upvotes: 1

Views: 4065

Answers (1)

royr2

Reputation: 2299

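Here's what I have, hope this is useful. As @Axeman mentioned, the trick is to turn the labels into a factor whose levels are in the order you want, because ggplot orders a discrete axis by factor levels and not by the row order of the data frame. Further, coord_flip() shows the labels in the opposite direction (the first level ends up at the bottom), so scale_x_discrete() with reversed limits is needed.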
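To illustrate the point about factor levels, here is a minimal sketch that uses only base R. The vector name `countries` and the two helper variables are made up for the illustration; the country names are taken from the sample in the question.

```r
# Country names in the order we would like to see them on the axis
countries <- c("United States", "Netherlands", "Germany", "Singapore")

# Default: levels are sorted alphabetically, regardless of the order of the data
default_levels <- levels(factor(countries))

# Explicit: levels follow the order of first appearance in the data
ordered_levels <- levels(factor(countries, levels = unique(countries)))

print(default_levels)
print(ordered_levels)
```

Sorting the rows of a data frame alone therefore does not change the plot. The sorted order has to be stored in the factor levels, which is what `levels = unique(...)` does after sorting.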

I am using the small sample you provided.

library(ggplot2)
library(dplyr)

df <- read.csv("data.csv")

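# Sort rows by news category, then by descending percentage within each category
df <- arrange(df, news, desc(Percentage))
# Fix the factor levels of V2 to the order of first appearance in the sorted data
df$V2 <- factor(df$V2, levels = unique(df$V2)) 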

ggplot(df, aes(x = V2, y = Percentage, fill = news))+ 
    geom_bar(stat = "identity", position = "stack") + 
    guides(fill = guide_legend(title = "Online News")) + 
    coord_flip() + 
    scale_x_discrete(limits = rev(levels(df$V2)))

enter image description here
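Since data.csv is not included here, the following is a self-contained sketch built from the counts in the question's sample table. It is a variation, not the exact code above: it assumes the goal is to order countries by the within-country share of "News Read" (which is what position = "fill" displays), and the names `share`, `country_order` and `p` are introduced only for this example.

```r
library(ggplot2)
library(dplyr)

# Counts copied from the sample table in the question
df <- data.frame(
  V2   = rep(c("United States", "Netherlands", "Germany", "Singapore"), times = 2),
  news = rep(c("News Read", "Otherwise"), each = 4),
  n    = c(1583, 1536, 1417, 1335, 581, 350, 623, 635),
  stringsAsFactors = FALSE
)

# Share of each news category within a country (sums to 1 per country)
df <- df %>%
  group_by(V2) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

# Country order: descending share of "News Read"
country_order <- df %>%
  filter(news == "News Read") %>%
  arrange(desc(share)) %>%
  pull(V2)

# Reverse the levels so that, after coord_flip(), the largest share is on top
df$V2 <- factor(df$V2, levels = rev(country_order))

p <- ggplot(df, aes(x = V2, y = share, fill = news)) +
  geom_col(position = "fill", width = 0.7) +
  coord_flip() +
  labs(x = "Country", y = "Share", fill = "Online News")
p
```

With these sample rows the Netherlands has the largest share of "News Read" and ends up at the top. Reversing the levels directly replaces the scale_x_discrete(limits = rev(...)) step; either approach should work, so pick whichever reads better to you.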

Upvotes: 6
