ahs85

Reputation: 1977

Violin Plot (geom_violin) with aggregated values

I would like to create violin plots from aggregated data. My data has a category column, a value column and a count column:

data <- data.frame(category = rep(LETTERS[1:3],3),
                   value = c(1,1,1,2,2,2,3,3,3),
                   count = c(3,2,1,1,2,3,2,1,3))

If I create a simple violin plot it looks like this:

library(ggplot2)

plot <- ggplot(data, aes(x = category, y = value)) + geom_violin()
plot


(source: ahschulz.de)

That is not what I wanted, because the counts are ignored. One solution would be to reshape the data frame by repeating the row of each category-value combination as many times as its count. The problem is that my counts go up to millions, so the expanded data takes hours to plot! :-(
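For reference, the row-repeating approach described above can be sketched like this for the small example. This is only an illustration of the idea, not a recommendation for large counts; the name expanded is made up here.

```r
library(ggplot2)

data <- data.frame(category = rep(LETTERS[1:3], 3),
                   value = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   count = c(3, 2, 1, 1, 2, 3, 2, 1, 3))

# Repeat each row index as many times as its count,
# then use the repeated indices to build the long data frame.
expanded <- data[rep(seq_len(nrow(data)), times = data$count), ]

# One row per observation, so the number of rows equals the total count.
nrow(expanded) == sum(data$count)

ggplot(expanded, aes(x = category, y = value)) + geom_violin()
```

With counts in the millions the expanded data frame grows accordingly, which is where the slowdown comes from.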

Is there a way to do this with the aggregated data directly?

Thanks in advance!

Upvotes: 5

Views: 3356

Answers (2)

Andy W

Reputation: 5089

You can supply a weight aesthetic, which is used when the density for each violin is estimated.

plot2 <- ggplot(data, aes(x = category, y = value, weight = count)) + geom_violin()
plot2

You will get warning messages that the weights do not sum to one, but that is OK. See here for a similar/related discussion.
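If the warnings are a nuisance, one option is to rescale the counts so that they sum to one within each category. This is a minimal sketch, assuming base R's ave() for the group-wise rescaling; the column name w is made up for illustration, and whether the warning appears at all may depend on your ggplot2 version.

```r
library(ggplot2)

data <- data.frame(category = rep(LETTERS[1:3], 3),
                   value = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   count = c(3, 2, 1, 1, 2, 3, 2, 1, 3))

# Divide each count by the total count of its category,
# so the weights within every category sum to one.
data$w <- ave(data$count, data$category, FUN = function(x) x / sum(x))

plot3 <- ggplot(data, aes(x = category, y = value, weight = w)) + geom_violin()
plot3
```

The relative weights inside each category are unchanged; only their scale differs.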


Upvotes: 7

Ben Bolker

Reputation: 226087

Using stat="identity" and specifying a violinwidth aesthetic appears to work, although I had to put in a fudge factor to scale the widths:

ggplot(data, aes(x = category, y = value)) +
   geom_violin(stat = "identity", aes(violinwidth = 0.2 * count))
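One way to avoid hand-tuning the factor might be to rescale the counts so that the largest one is 1 before mapping them to violinwidth. This is only a sketch: the column name vw is hypothetical, and I am assuming that violinwidth is interpreted as a fraction of the available violin width.

```r
library(ggplot2)

data <- data.frame(category = rep(LETTERS[1:3], 3),
                   value = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   count = c(3, 2, 1, 1, 2, 3, 2, 1, 3))

# Scale the counts to the range (0, 1]; the largest count gets the full width.
data$vw <- data$count / max(data$count)

ggplot(data, aes(x = category, y = value)) +
   geom_violin(stat = "identity", aes(violinwidth = vw))
```

Because the scaling uses the overall maximum, widths stay comparable across categories.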

Upvotes: 2
