Tpellirn

Reputation: 796

How to make a barplot of groups in a data frame?

For this data:

class <- c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4)
prog <- c("Bac2", "Bac", "Master", "Bac", "Bac", "DEA", "Doctorat", "DEA", "Bac", "DEA")
mydata <- data.frame(height = class, prog)

I want to make a plot like this, showing for each prog the percentage of each height value. For example:

   all values corresponding to Bac2 are 1, so that bar is 100% of 1
   the values corresponding to Bac are 2, 2, 1, 2, so that bar is 75% of 2 and 25% of 1
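
The two examples above can be checked numerically with a row-wise proportion table. This is only a sketch in base R; the names tab and share are hypothetical and not part of the original post.

```r
# Same data as in the question
mydata <- data.frame(
  height = c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4),
  prog = c("Bac2", "Bac", "Master", "Bac", "Bac",
           "DEA", "Doctorat", "DEA", "Bac", "DEA")
)

# Cross-tabulate programme (rows) against height (columns)
tab <- table(mydata$prog, mydata$height)

# margin = 1 divides each row by its row total, so every row sums to 1
share <- prop.table(tab, margin = 1)

# Show the shares as whole percentages
print(round(100 * share))
```

The Bac row should show 25 under height 1 and 75 under height 2, and the Bac2 row 100 under height 1, matching the description.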


mydata <- structure(list(height = c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4),
                         prog = c("Bac2", "Bac", "Master", "Bac", "Bac",
                                  "DEA", "Doctorat", "DEA", "Bac", "DEA")),
                    class = "data.frame", row.names = c(NA, -10L))

Upvotes: 0

Views: 58

Answers (3)

jay.sf

Reputation: 72613

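A succinct base R way: first compute the proportions of each height within each prog using table and proportions, then pad the resulting vectors to a common length so they can be bound into a matrix, order the groups by their maximum proportion, and finally pass the matrix to barplot. Note that the native pipe |> and the \(x) function shorthand require R 4.1 or later.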

p <- with(mydata, tapply(height, prog, \(x) proportions(table(x))))
lapply(p[order(-sapply(p, max))], `length<-`, max(lengths(p))) |>
  do.call(what=rbind) |> t() |> barplot(col=3:6)
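
For readers who find the one-liner dense, the same steps can be unpacked as below. This is a sketch and not the answerer's own code: the intermediate names p_ordered, n_max, padded and m are hypothetical, introduced only for illustration, and function(x) replaces \(x) so that it also runs on R versions before 4.1 (proportions itself needs R 4.0.1 or later).

```r
# Same data as in the question
mydata <- data.frame(
  height = c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4),
  prog = c("Bac2", "Bac", "Master", "Bac", "Bac",
           "DEA", "Doctorat", "DEA", "Bac", "DEA")
)

# Step 1: for each prog, the share of each height value (a list of tables)
p <- with(mydata, tapply(height, prog, function(x) proportions(table(x))))

# Step 2: put the groups with the largest single share first
p_ordered <- p[order(-sapply(p, max))]

# Step 3: pad every element with NA up to the longest one,
# so all elements have the same length
n_max <- max(lengths(p))
padded <- lapply(p_ordered, `length<-`, n_max)

# Step 4: bind into a matrix (one row per prog), then transpose so that
# each column is one bar and each row is one stacked segment
m <- t(do.call(rbind, padded))
print(m)

# Step 5: stacked barplot; NA segments are simply not drawn
barplot(m, col = 3:6)
```

Because the segments are matched by position rather than by height value, the colours here distinguish the first and second segment of each bar, not specific heights.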


Upvotes: 2

Rui Barradas

Reputation: 76402

Here is a way with ggplot2.
First pre-compute two things: the levels of prog in their order of appearance, so that "Bac2" comes before "Bac" as in the posted drawing, and the number of unique height values, which is needed to fill every bar segment in white.
Then plot the bars with position = "fill", which scales each bar to 100%.

suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})

mydata <- structure(list(height = c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4),
                         prog = c("Bac2", "Bac", "Master", "Bac", "Bac",
                                  "DEA", "Doctorat", "DEA", "Bac", "DEA")),
                    class = "data.frame", row.names = c(NA, -10L))

levs <- unique(mydata$prog)
nheight <- n_distinct(mydata$height)


mydata %>%
  mutate(prog = factor(prog, levels = levs)) %>%
  ggplot(aes(prog, fill = factor(height))) +
  geom_bar(position = "fill", colour = "black", show.legend = FALSE) +
  geom_text(aes(label = height), 
            stat = "count", 
            position = position_fill(vjust = 0.5)) +
  scale_fill_manual(values = rep("white", nheight)) +
  scale_y_continuous(labels = scales::percent)

Created on 2022-05-10 by the reprex package (v2.0.1)


Edit

y axis scale changed to a percent scale.
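
The effect of pre-computing the levels can be seen in isolation with a small base R sketch. The names f_default and f_ordered are hypothetical and used only for illustration.

```r
prog <- c("Bac2", "Bac", "Master", "Bac", "Bac",
          "DEA", "Doctorat", "DEA", "Bac", "DEA")

# Without explicit levels, factor() sorts the levels,
# so "Bac" ends up before "Bac2"
f_default <- factor(prog)
print(levels(f_default))

# unique() keeps the order of first appearance, which puts "Bac2" first
levs <- unique(prog)
f_ordered <- factor(prog, levels = levs)
print(levels(f_ordered))
```

ggplot2 draws a discrete x axis in factor-level order, which is why the factor is rebuilt with explicit levels before plotting.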

Upvotes: 1

YH Jang

Reputation: 1348

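class <- c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4)
prog <- c("Bac2", "Bac", "Master", "Bac", "Bac", "DEA", "Doctorat", "DEA", "Bac", "DEA")
mydata <- data.frame(height = class, prog)
library(dplyr)
library(ggplot2)
library(forcats)

mydata %>%
  group_by(prog, height) %>%
  tally() %>%
  # tally() leaves the result grouped by prog, so sum(n) is the per-prog total
  mutate(prop = n / sum(n)) %>%
  ggplot(aes(x = prog, y = prop, fill = fct_rev(as.factor(height)))) +
  geom_col() +
  # limits (not labels) reorders the bars; labels alone would only rename
  # the alphabetically sorted bars and so mislabel them
  scale_x_discrete(limits = c('Bac2', 'Bac', 'Master', 'DEA', 'Doctorat')) +
  scale_y_continuous(labels = scales::percent) +
  # 'none' is the value that hides the legend
  theme(legend.position = 'none')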

Created on 2022-05-10 by the reprex package (v2.0.1)

Upvotes: 3
