stanley

Reputation: 597

How to group dots and limit number of dots per column with categorical data in geom_dotplot?

I want to produce a dotplot of categorical data with three categorical variables, showing the number of individuals in each combination. I would like the first variable (type) on the y axis, with horizontal lines separating each type and the dots stacked on those lines, the second variable (sex) on the x axis, and the third variable (disease) as the fill color.

I can't get the dots to group or stack correctly. When I do get them to group correctly, they all end up in one long column that is too tall, so I would like the dots to stack to a maximum of five per column and then continue in a new column.

Here is code to reproduce the data:

df1 <- data.frame(sex="female", disease="yes", type="B") 
df2 <- data.frame(sex="female", disease="yes", type="C") 
df3 <- data.frame(sex="female", disease="no", type="B") 
df4 <- data.frame(sex="male", disease="yes", type="A") 
df5 <- data.frame(sex="male", disease="yes", type="B") 
df6 <- data.frame(sex="male", disease="yes", type="C") 
df7 <- data.frame(sex="male", disease="no", type="A") 
df8 <- data.frame(sex="male", disease="no", type="B") 
df9 <- data.frame(sex="male", disease="no", type="C") 
df10 <- data.frame(sex="male", disease="no", type="D") 
df2 <-df2[rep(nrow(df2), each = 3), ]
df3 <-df3[rep(nrow(df3), each = 29), ]
df5 <-df5[rep(nrow(df5), each = 3), ]
df8 <-df8[rep(nrow(df8), each = 35), ]
df9 <-df9[rep(nrow(df9), each = 7), ]
df.all<-rbind(df1, df2, df3, df4, df5, df6, df7, df8, df9, df10)

This is what I have so far, but the dots stack in the wrong direction and overlap. I want them in stacks of five dots:

library(ggplot2)
g <- ggplot(data = df.all) +
  geom_dotplot(aes(x = sex, y = type, fill = disease),
               binaxis = "y",
               stackgroups = FALSE,
               dotsize = 1,
               stackdir = "up")
g

This is the kind of plot I am trying to produce (but with my actual data): [image: desired result]

Upvotes: 1

Views: 210

Answers (1)

Maurits Evers

Reputation: 50728

When I saw your mock plot I was immediately thinking of waffle charts. Here's something that might not be exactly what you had in mind, but perhaps it'll give you some ideas.

  1. We start by reformatting your sample data: convert character columns to factor columns, set levels according to our final order in the plot, and add counts. We then nest data per type & sex combination.
library(tidyverse)
df <- df.all %>%
    count(sex, type, disease) %>%
    mutate_if(is.character, as.factor) %>%
    mutate(
        sex = factor(sex, levels = rev(levels(sex))),
        disease = factor(disease, levels = rev(levels(disease)))) %>%
    complete(type, nesting(disease, sex), fill = list(n = 0)) %>%
    group_by(sex, type) %>%
    nest()
  2. We now generate a waffle::waffle for every type and sex, and store the ggplot objects in a list.
library(waffle)
lst <- df %>%
    mutate(data = map(
        data, ~ .x %>%
            deframe() %>%
            waffle(
                rows = 5, 
                size = 1, 
                colors=c("#c7d4b6", "#a3aabd")) +
            theme(legend.position = "none"))) %>%
    pull(data)
  3. We now lay out the plot objects in a grid and add column and row labels (with the help of this post)
library(gridExtra)
library(grid)
combine <- rbind(
    tableGrob(
        t(df %>% pull(sex) %>% levels), theme = ttheme_minimal(), rows = ""), 
    cbind(tableGrob(
        df %>% pull(type) %>% levels, theme = ttheme_minimal()), 
        arrangeGrob(grobs = lst, ncol = 2),  size = "last"), size = "last")
grid.newpage()
grid.draw(combine)
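
If you would rather stay with plain ggplot2, here is a rough sketch of an alternative: compute the grid position of every dot yourself (five dots per column, then a new column) and draw the dots with geom_point inside a type-by-sex facet grid. This is not what the waffle package does internally, just one way to get a similar layout. The names counts, dots, idx and max_per_col are made up for this illustration, and the counts are typed in from the sample data in the question.

```r
# Counts per sex/disease/type, copied from the question's sample data
counts <- data.frame(
  sex     = c("female", "female", "female", "male", "male",
              "male", "male", "male", "male", "male"),
  disease = c("yes", "yes", "no", "yes", "yes",
              "yes", "no", "no", "no", "no"),
  type    = c("B", "C", "B", "A", "B", "C", "A", "B", "C", "D"),
  n       = c(1, 3, 29, 1, 3, 1, 1, 35, 7, 1)
)

max_per_col <- 5  # assumption: at most five dots stacked in one column

# Sort so that the disease groups stay contiguous within each sex/type panel,
# then expand to one row per individual
counts <- counts[order(counts$sex, counts$type, counts$disease), ]
dots <- counts[rep(seq_len(nrow(counts)), counts$n),
               c("sex", "type", "disease")]

# Running index 1..n of each dot within its sex/type panel
panel <- interaction(dots$sex, dots$type, drop = TRUE)
idx <- ave(seq_len(nrow(dots)), panel, FUN = seq_along)

# Turn the running index into grid coordinates
dots$col <- (idx - 1) %/% max_per_col + 1  # new column after every five dots
dots$row <- (idx - 1) %% max_per_col + 1   # height within the column (1 to 5)

# Plot only if ggplot2 is installed; the positions above are base R
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dots, aes(x = col, y = row, fill = disease)) +
    geom_point(shape = 21, size = 3) +
    facet_grid(type ~ sex, switch = "y") +  # type as rows, sex as columns
    coord_equal() +
    theme_minimal() +
    theme(axis.text = element_blank(),
          axis.title = element_blank(),
          panel.grid = element_blank())
  print(p)
}
```

Because the layout is ordinary data, changing max_per_col is enough to get taller or shorter stacks, and the usual ggplot2 scales and themes apply to the result.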


Upvotes: 4
