Bryan Downing

Reputation: 207

Extract y values from this trend line plot in Python

How does someone get the y values from the red trend line generated by the source code below? I am no math expert.

The code below is from "How to add trendline in python matplotlib dot (scatter) graphs?":

import numpy as np
import matplotlib.pyplot as plt

# random inputs for x and y
x = np.random.uniform(low=0.5, high=13.3, size=(50,))
y = np.random.uniform(low=0.5, high=13.3, size=(50,))

# plot the data itself
plt.plot(x, y, 'o')

# calculate the trend line: z holds [slope, intercept] of the degree-1 fit
z = np.polyfit(x, y, 1)
p = np.poly1d(z)
plt.plot(x, p(x), "r--")

# the line equation
print("y=%.6fx+(%.6f)" % (z[0], z[1]))

When I print p(x), I get the y values used to plot the red trend line:

[7.25072088 7.74580974 7.707636   7.57456601 7.72771792 7.36682509
 7.36216195 7.45937086 7.47592622 7.76663313 7.71256734 7.68601844
 7.34777885 7.2552914  7.28729136 7.4828444  7.25690455 7.47861776
 7.48472596 7.63791435 7.79364877 7.79382845 7.45020348 7.5488981
 7.29478413 7.27191799 7.47409563 7.26783249 7.49132469 7.2515923
 7.40558937 7.55062512 7.46004735 7.4094514  7.69985713 7.23891764
 7.50790404 7.38789488 7.23477781 7.59598148 7.49460819 7.62039958
 7.67580303 7.40553616 7.61933389 7.60038837 7.76048006 7.41307834
 7.28136679 7.5063726 ]

If this is an upward-moving trend, shouldn't the array elements increase from start to end? As you can see, some elements are lower than the one before them. Shouldn't there be a steady incline, where each element is ALWAYS higher than the previous one? Call me confused.

Upvotes: 2

Views: 1207

Answers (1)

NPE

Reputation: 500673

Should there not be a steady incline, where the next element would ALWAYS be higher than the previous element?

Yes, the fit is a straight line, so higher values of x are always associated with higher (or lower, depending on the slope) values of p(x).

What's happening in your case is that x is not sorted, and so p(x) isn't sorted either.

In [18]: x
Out[18]:
array([  9.95692606,   5.25372625,   9.84277793,   9.75691888,
         3.53691402,   7.47732635,  13.26638669,  10.39011192,
        11.86590794,  10.38592445,   0.5328471 ,   7.69932299,
        ...

As you can see, we're not starting on the left and moving to the right. We're first looking at some point in the middle, then jumping left a lot, then jumping right, then moving a little bit to the left, etc. The corresponding p(x) values are not going to be monotonic either.
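
One way to check this numerically is to compare consecutive steps in x with consecutive steps in p(x). The following is only a sketch: the fixed seed and the names rng, slope, intercept, steps_x and steps_y are my own choices for illustration and are not part of the original code.

```python
import numpy as np

# Assumption: a fixed seed so the run is repeatable; the question used unseeded data.
rng = np.random.default_rng(0)
x = rng.uniform(low=0.5, high=13.3, size=50)
y = rng.uniform(low=0.5, high=13.3, size=50)

# Degree-1 fit: polyfit returns the coefficients as [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)
p = np.poly1d([slope, intercept])

steps_x = np.diff(x)      # how far x moves between consecutive samples
steps_y = np.diff(p(x))   # how far the trend value moves between the same samples

# Each step in p(x) is the slope times the matching step in x.
print(np.allclose(steps_y, slope * steps_x))

# Unsorted x steps both right and left, so p(x) steps both up and down.
print(bool((steps_x > 0).any() and (steps_x < 0).any()))
```

Because every step in p(x) is just the slope times the step in x, the ups and downs in the printed array come from the order of x, not from the fit.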

If you sort the points from left to right, you'll see that they indeed always move in the same vertical direction:

In [20]: sorted(zip(x, p(x)))
Out[20]:
[(0.53284710066507301, 5.2982022878459842),
 (0.90494271648495472, 5.3490731826338447),
 (1.2383322417505211, 5.3946523906172272),
 (1.2542322226117251, 5.3968261497778585),
 (1.3243912128123114, 5.4064179064586044),
 (1.4506628234207115, 5.4236810763129437),
 (2.0368566039434102, 5.503822311163459),
 (2.8349103207704576, 5.6129278876274968),
 (3.0174136939304748, 5.637878759123244),
 (3.5369140229038196, 5.7089020269444219),
 (4.932863919562303, 5.8997487268324766),
 (4.943993127936622, 5.9012702518497351),
 (4.9500689452818589, 5.9021009046491208),
 ...
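
If you want the trend line's y values as arrays rather than a list of tuples, a possible alternative is to sort with np.argsort, or to evaluate p on an evenly spaced grid. This is a sketch under my own assumptions: the seed, the grid size of 100, and the names order, x_sorted, trend_sorted, x_line and y_line are illustrative only.

```python
import numpy as np

# Assumption: seeded random data standing in for the question's x and y.
rng = np.random.default_rng(1)
x = rng.uniform(low=0.5, high=13.3, size=50)
y = rng.uniform(low=0.5, high=13.3, size=50)

p = np.poly1d(np.polyfit(x, y, 1))

# Sort the sample points from left to right, then evaluate the trend line there.
order = np.argsort(x)
x_sorted = x[order]
trend_sorted = p(x_sorted)

# On sorted x, the trend values only ever move in one vertical direction.
steps = np.diff(trend_sorted)
is_monotonic = bool(np.all(steps >= 0) or np.all(steps <= 0))
print(is_monotonic)

# Alternatively, read y values off the line at evenly spaced x positions.
x_line = np.linspace(x.min(), x.max(), 100)
y_line = p(x_line)
print(y_line[0], y_line[-1])
```

Plotting x_sorted against trend_sorted draws the same straight line as before; the only difference is the order in which the values are stored.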

Upvotes: 1
