dan

Reputation: 6314

Add fitted regression line to R plotly violins

I have xy data for three groups across three integer ages:

set.seed(1)
df <- data.frame(value = c(rnorm(500,8,1),rnorm(600,6,1.5),rnorm(400,4,0.5),rnorm(500,2,2),rnorm(400,4,1),rnorm(600,7,0.5),rnorm(500,3,1),rnorm(500,3,1),rnorm(500,3,1)),
                 age = c(rep(3,500),rep(8,600),rep(24,400),rep(3,500),rep(8,400),rep(24,600),rep(3,500),rep(8,500),rep(24,500)),
                 group = c(rep("A",1500),rep("B",1500),rep("C",1500)))

I want to use R's plotly to plot the values as violins, like this:

library(plotly)
library(dplyr)
df$age <- factor(df$age)
plot_ly(x=df$group,y=df$value,type='violin',name=df$age,color=df$age,box=list(visible=T)) %>%
  layout(violinmode='group')


I would also like to add each group's lm fitted line and, if possible, its standard error lines.

So I first add the fitted values for each group to df:

# age is a factor at this point, so go through as.character() to recover 3, 8, 24
df$age <- as.integer(as.character(df$age))
df$fitted.value <- unlist(lapply(c("A","B","C"),function(g) lm(value ~ age,dplyr::filter(df,group == g)) %>% fitted.values()))
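
A caveat on converting the factor back to integers, as a minimal sketch: calling as.integer() directly on a factor returns the internal level codes (1, 2, 3) rather than the original ages, so the regression would be fitted against the codes. Converting via as.character() first recovers the ages. The vector `age` below is a toy example, not the question's data.

```r
# Toy factor built the same way as df$age (illustrative values only)
age <- factor(c(3, 8, 24, 8))

# Direct conversion returns the level codes, not the ages
codes <- as.integer(age)

# Going through as.character() recovers the original values
values <- as.integer(as.character(age))

print(codes)   # 1 2 3 2
print(values)  # 3 8 24 8
```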

Then I tried plotting a single group's violins together with its fitted line:

df <- df %>% dplyr::filter(group == "A")
plot_ly(x=df$age,y=df$value,type='violin',color=df$age,box=list(visible=T)) %>%
  add_trace(x=df$age,y=df$fitted.value,mode="lines")

But this gives this long list of warnings:

Warning messages:
1: 'violin' objects don't have these attributes: 'mode'
Valid attributes include:
'type', 'visible', 'showlegend', 'legendgroup', 'opacity', 'uid', 'ids', 'customdata', 'meta', 'selectedpoints', 'hoverinfo', 'hoverlabel', 'stream', 'transforms', 'uirevision', 'y', 'x', 'x0', 'y0', 'name', 'orientation', 'bandwidth', 'scalegroup', 'scalemode', 'spanmode', 'span', 'line', 'fillcolor', 'points', 'jitter', 'pointpos', 'width', 'marker', 'text', 'hovertext', 'hovertemplate', 'box', 'meanline', 'side', 'offsetgroup', 'alignmentgroup', 'selected', 'unselected', 'hoveron', 'xaxis', 'yaxis', 'idssrc', 'customdatasrc', 'metasrc', 'hoverinfosrc', 'ysrc', 'xsrc', 'textsrc', 'hovertextsrc', 'hovertemplatesrc', 'key', 'set', 'frame', 'transforms', '_isNestedKey', '_isSimpleKey', '_isGraticule', '_bbox'
 
2, 3: identical to warning 1

It also produces an undesired plot (image not included).

Any idea how to add the trend lines for all groups and, ideally, also their standard error lines?

Upvotes: 1

Views: 615

Answers (1)

MarBlo

Reputation: 4514

Maybe you can use this solution. It starts by building a new data frame, ddf, with an additional variable, moreA, that combines the group and the age.

A second data frame, sum_res, holds the mean value for each group_by(group, moreA) combination.

The plot is made with ggplot: geom_violin uses the data from ddf, while geom_point and geom_smooth use sum_res, so the linear fit is made on the mean values.

After this, the ggplot object is passed to ggplotly.

Unfortunately, ggplotly's interactive behaviour, such as hovering, does not show up in this reprex, but it works in RStudio.


library(plotly)
library(tidyverse)


# df is the data frame from the question, with age still numeric (3, 8, 24)
ddf <- df %>%
  mutate(moreA = case_when(
    age == 3 ~ paste0(group, '3'),
    age == 8 ~ paste0(group, '8'),
    age == 24 ~ paste0(group, '24')
  ))

# fix the factor levels so the x axis keeps the order in which the
# group/age combinations appear in the data
ddf$moreA <- factor(ddf$moreA, levels = unique(ddf$moreA))

# mean value per group and age
sum_res <- ddf %>%
  group_by(group, moreA) %>%
  summarise(meanA = mean(value), .groups = 'drop')

p <- ggplot(ddf, aes(x = moreA, y = value, color = moreA)) +
  geom_violin() +
  geom_point(data = sum_res, mapping = aes(x = moreA, y = meanA), size = 1) +
  # linear fit through the three mean values of each group
  geom_smooth(data = sum_res,
              mapping = aes(x = moreA, y = meanA, group = group, color = group),
              method = 'lm', size = 1, se = FALSE) +
  theme_bw() +
  theme(legend.position = 'none')

ggplotly(p)
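
A plotly-only variant might look like the sketch below. It is not part of the original answer, and it rests on a few assumptions: the data are a small synthetic stand-in shaped like the question's df, the names `cell` and `fits` are made up for illustration, and the error lines are simply fit plus or minus one standard error taken from predict(). The warnings in the question appear because add_trace() inherits type = 'violin' from plot_ly(), so the line traces here set type = "scatter" and inherit = FALSE explicitly.

```r
library(plotly)

# Synthetic stand-in for the question's data (age kept numeric)
set.seed(1)
df <- expand.grid(rep = 1:100, age = c(3, 8, 24), group = c("A", "B", "C"),
                  stringsAsFactors = FALSE)
df$value <- rnorm(nrow(df), mean = 5)

# One x category per group/age pair, in order of appearance
df$cell <- paste0(df$group, df$age)
df$cell <- factor(df$cell, levels = unique(df$cell))

# One lm per group, predicted at the observed ages with standard errors
fits <- do.call(rbind, lapply(split(df, df$group), function(d) {
  nd <- data.frame(age = sort(unique(d$age)))
  pr <- predict(lm(value ~ age, data = d), newdata = nd, se.fit = TRUE)
  data.frame(group = d$group[1], cell = paste0(d$group[1], nd$age),
             fit = pr$fit, lwr = pr$fit - pr$se.fit, upr = pr$fit + pr$se.fit)
}))

# Violins first, then the fitted line and +/- 1 SE lines for each group
p <- plot_ly(df, x = ~cell, y = ~value, type = "violin", color = ~group,
             box = list(visible = TRUE))
for (g in unique(fits$group)) {
  f <- fits[fits$group == g, ]
  p <- p %>%
    add_trace(x = f$cell, y = f$fit, type = "scatter", mode = "lines",
              name = paste("fit", g), inherit = FALSE) %>%
    add_trace(x = f$cell, y = f$lwr, type = "scatter", mode = "lines",
              line = list(dash = "dot"), showlegend = FALSE, inherit = FALSE) %>%
    add_trace(x = f$cell, y = f$upr, type = "scatter", mode = "lines",
              line = list(dash = "dot"), showlegend = FALSE, inherit = FALSE)
}
p
```

Because the x axis is categorical, the three ages of each group are drawn at evenly spaced positions, so the fitted values (computed on the numeric ages) will not generally fall on a visually straight line.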

Upvotes: 2
