Will

Reputation: 113

Violin plots with additional points

Suppose I make a violin plot, with say 10 violins, using the following code:

library(ggplot2)    
library(reshape2)  

df <- melt(data.frame(matrix(rnorm(500),ncol=10)))
p <- ggplot(df, aes(x = variable, y = value)) +
        geom_violin()
p

I can add a dot representing the mean of each variable as follows:

p + stat_summary(fun = mean, geom = "point", size = 2, color = "red")

How can I do something similar but for arbitrary points?
For example, if I generate 10 new points, one drawn from each distribution, how could I plot those as dots on the violins?

Upvotes: 1

Views: 2696

Answers (3)

Qian Zhang

Reputation: 41

I had a similar problem. The code below works through a toy version of it (how do you add arbitrary points to a violin plot?) and shows one solution.

## Inspect the ToothGrowth data set that ships with base R

library(ggplot2)

head(ToothGrowth)

## Make a violin plot with dose variable on x-axis, len variable on y-axis

# Convert dose variable to factor - Important!
ToothGrowth$dose <- as.factor(ToothGrowth$dose)

# Plot
p <- ggplot(ToothGrowth, aes(x=dose, y=len)) +
             geom_violin(trim = FALSE) +
             geom_boxplot(width=0.1)

# Suppose you want to add 3 blue points at (dose, len) =
# (0.5, 10), (1, 20), (2, 30).
# Make a new data frame with these points
# and add them to the plot with geom_point().

# Copy the first three rows as a template, then overwrite len and dose
TrueVals <- ToothGrowth[1:3, ]
TrueVals$len <- c(10, 20, 30)

# Make dose variable a factor - Important for positioning points correctly!
TrueVals$dose <- as.factor(c(0.5, 1, 2))

# Plot with 3 added blue points

p <- ggplot(ToothGrowth, aes(x = dose, y = len)) +
  geom_violin(trim = FALSE) +
  geom_boxplot(width = 0.1) +
  geom_point(data = TrueVals, color = "blue")
p
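As a variation on the above, here is a minimal sketch that builds the extra points from scratch rather than copying rows of ToothGrowth. The names tg and true_vals are hypothetical, and the sketch assumes the point values are the same three illustrative ones used above. Reusing the factor levels of the main data is one way to make sure each point lands on the intended violin.

```r
library(ggplot2)

# Work on a copy so the built-in data set is left untouched
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

# Hypothetical points to overlay; dose reuses the levels of the main data
true_vals <- data.frame(
  dose = factor(c(0.5, 1, 2), levels = levels(tg$dose)),
  len  = c(10, 20, 30)
)

# A value that matches no level would become NA, so check for that
stopifnot(!anyNA(true_vals$dose))

p <- ggplot(tg, aes(x = dose, y = len)) +
  geom_violin(trim = FALSE) +
  geom_point(data = true_vals, color = "blue", size = 3)
p
```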

Upvotes: 0

Richard Telford

Reputation: 9923

You can give stat_summary any function, provided it returns a single value per group. So you can use the function sample here. Pass extra arguments, such as size, via fun.args:

p + stat_summary(fun = "sample", geom = "point", fun.args = list(size = 1))
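To illustrate the "any function that returns a single value" point, here is a small sketch with a user-defined summary function. The helper upper_quartile is hypothetical, and the sketch assumes ggplot2 3.3.0 or later, where the argument is named fun (older versions use fun.y).

```r
library(ggplot2)
library(reshape2)

set.seed(42)  # make the simulated data reproducible
df <- melt(data.frame(matrix(rnorm(500), ncol = 10)))

# Hypothetical helper: returns exactly one value per group (the 75th percentile)
upper_quartile <- function(y) unname(quantile(y, 0.75))

p <- ggplot(df, aes(x = variable, y = value)) +
  geom_violin() +
  stat_summary(fun = upper_quartile, geom = "point", size = 2, color = "red")
p
```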

Upvotes: 2

r2evans

Reputation: 160447

Assuming your points are labelled with the same group names (i.e., the variable column), you can supply them in their own data frame:

library(dplyr)

# sample_n(10) draws 10 rows per group; use sample_n(1) for one point per violin
newdf <- group_by(df, variable) %>% sample_n(10)
p + geom_point(data=newdf)

The points can be anything, including static numbers:

newdf <- data.frame(variable = unique(df$variable), value = seq(-2, 2, length.out = 10))
p + geom_point(data=newdf)
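For the case in the question (10 new points, one drawn from each distribution), a minimal sketch might look like the following. It assumes every column comes from a standard normal, as in the question's simulated data. The name newpts is hypothetical.

```r
library(ggplot2)
library(reshape2)

set.seed(1)  # make the simulated data reproducible
df <- melt(data.frame(matrix(rnorm(500), ncol = 10)))

# One fresh draw per violin; variable reuses the levels of the plotted data
newpts <- data.frame(
  variable = factor(levels(df$variable), levels = levels(df$variable)),
  value    = rnorm(nlevels(df$variable))
)

p <- ggplot(df, aes(x = variable, y = value)) +
  geom_violin() +
  geom_point(data = newpts, color = "red", size = 2)
p
```

Because newpts has the same column names as the aesthetics mapped in ggplot(), geom_point() inherits the mapping and only needs the data argument.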

Upvotes: 1
