Jonas

Reputation: 67

np.polyfit plot with uncertainty on the y in Python

I have the following dataframe:

x      y           error_on_y
1      1.2         0.1
2      0.87        0.23
4      1.12        0.11
5      0.75        0.06
5      0.66        0.15
6      0.98        0.08
7      1.34        0.05
7      2.86        0.12

With this dataframe I want to use np.polyfit to fit a regression line. I've fitted the line using:

x = np.array(dataframe['x'])
y = np.array(dataframe['y'])
y_err = np.array(dataframe['error_on_y'])

# weight each point by the inverse of its error; cov=True also returns the covariance matrix
np.polyfit(x, y, deg=1, w=1/y_err, cov=True)

However, I can't find how to plot this fit with the errors included. So far, the only examples of plotting with np.polyfit I have found use fits without a specified weight.

Does anyone know how I would be able to plot this line, or a link to a decent example? I haven't been able to find one despite looking for quite a while, so any expertise on the matter would be very welcome and much appreciated!

EDIT/Clarification:

When neither the weight (w) nor cov=True is passed, np.polyfit returns a single vector with the coefficients that minimise the squared error. However, with cov=True a second array is returned as well:

np.polyfit(x,y,deg=1, w=1/y_err, cov=True)

output:

(array([0.00097481, 0.82290694]),
 array([[ 4.75261249e-09, -2.28408710e-07],
        [-2.28408710e-07,  1.41696109e-05]]))
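
For reference, the second item in this tuple is the covariance matrix of the fitted coefficients, and the square roots of its diagonal give their 1-sigma uncertainties. A minimal sketch of unpacking it, assuming the x, y and y_err arrays defined above:

import numpy as np

# assumes x, y, y_err are the arrays built from the dataframe above
coeffs, cov = np.polyfit(x, y, deg=1, w=1/y_err, cov=True)

m, b = coeffs                          # slope and intercept
m_err, b_err = np.sqrt(np.diag(cov))   # 1-sigma uncertainties on the coefficients

print(f"slope     = {m:.5f} +/- {m_err:.5f}")
print(f"intercept = {b:.5f} +/- {b_err:.5f}")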

EDIT/additional info:

After finding this link (https://peteris.rocks/blog/extrapolate-lines-with-numpy-polyfit/) I found that, without a weight (and with cov left at its default), polyfit only returns the first array, i.e.

vector = array([0.00097481, 0.82290694])

For the line function y = mx + b, m = vector[0] and b = vector[1], i.e. m is the slope and b is the intercept. This means the additional array in the example above must come from passing cov=True to the function.
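
As a quick sanity check on that slope/intercept reading, a small sketch using the coefficients quoted above: the line can be evaluated directly or through np.poly1d, which uses the same coefficient order.

import numpy as np

vector = np.array([0.00097481, 0.82290694])
m, b = vector                 # slope and intercept

line = np.poly1d(vector)      # equivalent polynomial object: m*x + b

x_test = np.array([1.0, 4.0, 7.0])
print(m * x_test + b)         # manual evaluation of y = m*x + b
print(line(x_test))           # identical result via poly1d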

I am trying to find out how I should interpret/plot this with the weights included :)

POSSIBLE ANSWER: I've found the following:

import numpy as np
import matplotlib.pyplot as plt

new = np.polyfit(x, y, deg=1, w=1/y_err, cov=True)
m, b = new[0]      # fitted slope and intercept
a, c = new[1][0]   # first row of the covariance matrix
d, e = new[1][1]   # second row of the covariance matrix
print(m, b, a, c, d, e)

for i in range(int(min(x)), int(max(x)) + 1):
    plt.plot(i, i * m + b, 'go')              # fitted line (green)
    plt.plot(i, i * (m + a) + (b + c), 'bo')  # line shifted by the first covariance row (blue)
    plt.plot(i, i * (m - d) + (b - e), 'ro')  # line shifted by the second covariance row (red)

plt.show()

In this example I assumed that the first array given in

(array([0.00097481, 0.82290694]),
 array([[ 4.75261249e-09, -2.28408710e-07],
        [-2.28408710e-07,  1.41696109e-05]]))

holds the coefficients of the fitted regression line, and that the second array describes the error on that line (it is the covariance matrix of the coefficients). This is what I tried and I believe it makes sense. It is not definitive though, so I'll leave the post open for comments, remarks and better solutions.
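
For what it's worth, a possibly cleaner way to plot the weighted fit together with its uncertainty (a sketch, not a definitive answer, assuming the x, y and y_err arrays from the question): treat the second array as the covariance matrix of the coefficients, propagate it to the fitted line with standard linear error propagation, and shade the resulting band with fill_between.

import numpy as np
import matplotlib.pyplot as plt

# sketch: assumes x, y, y_err from the question are already defined
coeffs, cov = np.polyfit(x, y, deg=1, w=1/y_err, cov=True)
m, b = coeffs

x_plot = np.linspace(x.min(), x.max(), 100)
y_fit = m * x_plot + b

# propagate the coefficient covariance to the line:
# Var(y) = x^2 * Var(m) + Var(b) + 2 * x * Cov(m, b)
y_sigma = np.sqrt(x_plot**2 * cov[0, 0] + cov[1, 1] + 2 * x_plot * cov[0, 1])

plt.plot(x_plot, y_fit, 'g-', label='weighted fit')
plt.fill_between(x_plot, y_fit - y_sigma, y_fit + y_sigma,
                 color='green', alpha=0.2, label='1-sigma band')
plt.legend()
plt.show()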

Upvotes: 3

Views: 2363

Answers (1)

Yuca

Reputation: 6091

When you do np.polyfit(x,y,deg=1, w=1/y_err, cov=True) you're calculating (among other things) the coefficients of a polynomial. Because cov=True makes the call return both the coefficients and their covariance matrix, unpack the result first and then wrap the coefficients in a polynomial object to manipulate them easily:

coefs, mycov = np.polyfit(x, y, deg=1, w=1/y_err, cov=True)
p = np.poly1d(coefs)   # np.poly1d takes only the coefficient array, not the covariance matrix

and plot it using

x_plot = np.linspace(1, 7, 100)
plt.plot(x_plot, p(x_plot))
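
Since the question is specifically about data with an uncertainty on y, it may also help to draw the measurements with their error bars on top of this line; a short sketch, assuming x, y and y_err from the question and the p object defined above:

import numpy as np
import matplotlib.pyplot as plt

x_plot = np.linspace(1, 7, 100)
plt.errorbar(x, y, yerr=y_err, fmt='ko', capsize=3, label='data')  # points with their y-errors
plt.plot(x_plot, p(x_plot), 'g-', label='weighted fit')
plt.legend()
plt.show()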

Upvotes: 1
