user1728853
user1728853

Reputation: 2707

Line fitting below points

I have a set of x, y points and I'd like to find the line of best fit such that the line is below all points using SciPy. I'm trying to use leastsq for this, but I'm unsure how to adjust the line to be below all points instead of the line of best fit. The coefficients for the line of best fit can be produced via:

def linreg(x, y):

    fit = lambda params, x: params[0] * x - params[1]
    err = lambda p, x, y: (y - fit(p, x))**2 

    # initial slope/intercept
    init_p = np.array((1, 0))

    p, _ = leastsq(err, init_p.copy(), args=(x, y))

    return p

xs = sp.array([1, 2, 3, 4, 5])  
ys = sp.array([10, 20, 30, 40, 50])

print linreg(xs, ys)

The output is the coefficients for the line of best fit:

array([  9.99999997e+00,  -1.68071668e-15])

How can I get the coefficients of the line of best fit that is below all points?

Upvotes: 5

Views: 1947

Answers (1)

Jaime
Jaime

Reputation: 67447

A possible algorithm is as follows:

  1. Move the axes to have all the data on the positive half of the x axis.

  2. If the fit is of the form y = a * x + b, then for a given b the best fit for a will be the minimum of the slopes joining the point (0, b) with each of the (x, y) points.

  3. You can then calculate a fit error, which is a function of only b, and use scipy.optimize.minimize to find the best value for b.

  4. All that's left is computing a for that b and calculating b for the original position of the axes.

The following does that most of the time, except when the minimization fails with some mysterious error:

from __future__ import division
import numpy as np
import scipy.optimize
import matplotlib.pyplot as plt

def fit_below(x, y) :
    idx = np.argsort(x)
    x = x[idx]
    y = y[idx]
    x0, y0 = x[0] - 1, y[0]
    x -= x0
    y -= y0

    def error_function_2(b, x, y) :
        a = np.min((y - b) / x)
        return np.sum((y - a * x - b)**2)

    b = scipy.optimize.minimize(error_function_2, [0], args=(x, y)).x[0]

    a = np.min((y - b) / x)

    return a, b - a * x0 + y0

x = np.arange(10).astype(float)
y = x * 2 + 3 + 3 * np.random.rand(len(x))

a, b = fit_below(x, y)

plt.plot(x, y, 'o')
plt.plot(x, a*x + b, '-')
plt.show()

And as TheodrosZelleke wisely predicted, it goes through two points that are part of the convex hull:

enter image description here

Upvotes: 6

Related Questions