Rob

Reputation: 1426

Regression line for holoviews scatter plot?

I'm creating a scatter plot from an xarray Dataset using

scat = ds.hvplot.scatter(x='a', y='b', groupby='c', height=900, width=900)

How can I add a regression line to this plot?

I'm also using the hook below to set some properties of the plot. I could add a Bokeh Slope inside the hook function, but I can't figure out how to access x and y from plot.state. This might also be completely the wrong way of doing it.

def hook(plot, element):
    print('plot.state:   ', plot.state)
    print('plot.handles: ', sorted(plot.handles.keys()))

    par = np.polyfit(x, y, 1, full=True)
    gradient=par[0][0]
    y_intercept=par[0][1]

    slope = Slope(gradient=gradient, y_intercept=y_intercept,
          line_color='orange', line_dash='dashed', line_width=3.5)

    plot.state.add_layout(slope)

scat = scat.opts(hooks=[hook])

Upvotes: 6

Views: 2322

Answers (2)

Sander van den Oord

Reputation: 12838

HoloViews 1.13 and later support adding a regression line to a plot directly, so you don't need hooks anymore.

1) You can either add the regression line yourself by specifying keywords slope and y_intercept:

import numpy as np
import holoviews as hv
hv.extension('bokeh')

gradient = 2
y_intercept = 15

# create random data
xpts = np.arange(0, 20)
ypts = gradient * xpts + y_intercept + np.random.normal(0, 4, 20)

scatter = hv.Scatter((xpts, ypts))

# create slope with hv.Slope()
slope = hv.Slope(gradient, y_intercept)

scatter.opts(size=10) * slope.opts(color='red', line_width=6)



2) Or you can have HoloViews calculate it for you with hv.Slope.from_scatter():

normal = hv.Scatter(np.random.randn(20, 2))

normal.opts(size=10) * hv.Slope.from_scatter(normal)



Resulting plot: [image: scatter plot with regression line, HoloViews 1.13]

Upvotes: 8

philippjfr

Reputation: 4080

A plot hook is given two arguments, the second of which is the element being displayed. Since the element contains the data being displayed, the hook can compute the slope by calling the element's dimension_values method to get the values of the 'a' and 'b' dimensions in your data. Additionally, to avoid adding the Slope glyph multiple times, we can cache it in plot.handles and update its attributes on later calls:

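import numpy as np
from bokeh.models import Slope

def hook(plot, element):
    # The element holds the data currently displayed, so read x and y from it
    x, y = element.dimension_values('a'), element.dimension_values('b')
    # Degree-1 least-squares fit; coefficients come back highest degree first
    gradient, y_intercept = np.polyfit(x, y, 1)

    if 'slope' in plot.handles:
        # Reuse the cached glyph and update it in place
        slope = plot.handles['slope']
        slope.gradient = gradient
        slope.y_intercept = y_intercept
    else:
        # First call: create the glyph, cache it, and add it to the figure
        slope = Slope(gradient=gradient, y_intercept=y_intercept,
                      line_color='orange', line_dash='dashed', line_width=3.5)
        plot.handles['slope'] = slope
        plot.state.add_layout(slope)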
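
For reference, the gradient and intercept that a degree-1 least-squares fit produces can also be written out by hand. The sketch below is pure Python and purely illustrative: fit_line is a hypothetical helper name, not part of NumPy, HoloViews, or Bokeh, and NumPy's own implementation may solve the problem differently.

```python
def fit_line(xs, ys):
    """Return (gradient, y_intercept) of the least-squares line through the points."""
    # fit_line is a hypothetical helper, shown for illustration only
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the products of x and y deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    y_intercept = mean_y - gradient * mean_x
    return gradient, y_intercept


# Points lying exactly on y = 2x + 15
gradient, y_intercept = fit_line([0, 1, 2, 3], [15, 17, 19, 21])
print(gradient, y_intercept)
```

Inside the hook, the x and y values returned by element.dimension_values could be passed to a function like this in place of np.polyfit, and the two results used for the Slope's gradient and y_intercept.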

Upvotes: 3
