Cosmoman

Reputation: 101

Finding the slope trend from best fit lines

I am trying to determine the trend in the slopes of several best-fit lines, each fitted to its own set of points. Once I have that trend, I want to plot additional lines that follow it on the same plot. For example: [image]

This plot is basically what I want to produce, but I am not sure how. It has several best-fit lines, each with its own points and slope, that all intersect at x = 6. Beyond those, it has several more lines whose slopes follow the trend of the fitted ones. I assume I can do something similar starting from the code below, but I am unsure how to adapt it.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# simulate some artificial data
# =====================================
df = pd.DataFrame( { 'Age' : np.random.rand(25) * 160 } )

df['Length'] = df['Age'] * 0.88 + np.random.rand(25) * 5000

# plot those data points
# ==============================
fig, ax = plt.subplots()
ax.scatter(df['Length'], df['Age'])

# Now add on a line with a fixed slope of 0.03
slope = 0.03

# A line with a fixed slope can intercept the axis
# anywhere so we're going to have it go through 0,0
x_0 = 0
y_0 = 0

# And we'll have the line stop at x = 5000
x_1 = 5000
y_1 = slope * (x_1 - x_0) + y_0

# Draw these two points with big triangles to make it clear
# where they lie
ax.scatter([x_0, x_1], [y_0, y_1], marker='^', s=150, c='r')

# And now connect them
ax.plot([x_0, x_1], [y_0, y_1], c='r')    

plt.show()

Upvotes: 2

Views: 2637

Answers (2)

yifan

Reputation: 72

I modified your code a little. What you need is essentially a piecewise function: up to a certain x value (3000 here) the lines have different slopes but all end at the same point, and after that the slope is just 0.

The plot is as follows:

[image]

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# simulate some artificial data
# =====================================
df = pd.DataFrame( { 'Age' : np.random.rand(25) * 160 } )

df['Length'] = df['Age'] * 0.88 + np.random.rand(25) * 5000

# plot those data points
# ==============================
fig, ax = plt.subplots()
ax.scatter(df['Length'], df['Age'])

# First segment: a range of negative slopes; second segment: flat
slope1 = np.arange(-0.05, 0, 0.01)
slope2 = 0

# Every sloped line starts at x = x_0 and ends at the shared point (x_1, y_1)
x_0 = 0
x_1 = 3000
y_1 = 0

# Draw one line per slope, working back from (x_1, y_1) to find y_0
for slope in slope1:
    y_0 = y_1 - slope * (x_1 - x_0)
    ax.plot([x_0, x_1], [y_0, y_1], c='r')

# The flat segment runs from x = 3000 to x = 5000
x_2 = 5000
y_2 = slope2 * (x_2 - x_1) + y_1

# Mark the endpoints of the last sloped line with big triangles
ax.scatter([x_0, x_1], [y_0, y_1], marker='^', s=150, c='r')

# And now draw the flat segment
ax.plot([x_1, x_2], [y_1, y_2], c='r')

plt.show()
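
As a minimal sketch of the piecewise idea described above, the two segments can also be written as a single function. The helper name `piecewise_line` and its default break point are illustrative assumptions, not part of the original code:

```python
import numpy as np

def piecewise_line(x, slope, x_break=3000.0, y_break=0.0):
    """Hypothetical helper: sloped line up to x_break, flat (slope 0) after it.

    Every line passes through (x_break, y_break), whatever its slope.
    """
    x = np.asarray(x, dtype=float)
    sloped = y_break + slope * (x - x_break)
    return np.where(x < x_break, sloped, y_break)

xs = np.array([0.0, 1500.0, 3000.0, 5000.0])
ys = piecewise_line(xs, slope=-0.05)
```

Calling it once per slope and passing the result to `ax.plot(xs, ys)` should give the same kind of fan of lines that meet at x = 3000.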

Upvotes: 1

DavidG

Reputation: 25400

The value y_1 can be found by using the equation of a straight line given by your slope and y_0:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Age': np.random.rand(25) * 160})
df['Length'] = df['Age'] * 0.88 + np.random.rand(25) * 5000

fig, ax = plt.subplots()
ax.scatter(df['Length'], df['Age'])

slope = 0.03
x_0 = 0
y_0 = 0
x_1 = 5000
y_1 = (slope * x_1) + y_0  # equation of a straight line: y = mx + c

ax.plot([x_0, x_1], [y_0, y_1], marker='^', markersize=10, c='r')

plt.show()

Which produces the following graph:

[image]

In order to plot multiple lines, first create an array/list of gradients that will be used and then follow the same steps:

df = pd.DataFrame({'Age': np.random.rand(25) * 160})
df['Length'] = df['Age'] * 0.88 + np.random.rand(25) * 5000

fig, ax = plt.subplots()
ax.scatter(df['Length'], df['Age'])

slope = 0.03
x_0 = 0
y_0 = 0
x_1 = 5000

slopes = np.linspace(0.01, 0.05, 5)  # create an array containing the gradients

new_y = (slopes * x_1) + y_0  # find the corresponding y values at x = 5000

for i in range(len(slopes)):
    ax.plot([x_0, x_1], [y_0, new_y[i]], marker='^', markersize=10, label=slopes[i])

plt.legend(title="Gradients")
plt.show()

This produces the following figure:

[image]
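
The snippets above take the slopes as given. If the slopes first have to be estimated from data, one possible sketch is to fit each point set with `np.polyfit`, fit a second line to the fitted slopes, and extrapolate. This assumes the slope changes linearly from one line to the next and that all lines pass through a known common point (x = 6 in the question's plot). The data, the pivot height `y_pivot`, and all variable names here are made up for illustration:

```python
import numpy as np

# Assumed common point that every line passes through
x_pivot, y_pivot = 6.0, 10.0

# Made-up data: three point sets lying exactly on lines through the pivot
x = np.linspace(0.0, 6.0, 7)
known_slopes = [1.0, 1.5, 2.0]
series = [y_pivot + m * (x - x_pivot) for m in known_slopes]

# Step 1: best-fit slope of each point set
# (for deg=1, np.polyfit returns [slope, intercept])
fitted_slopes = np.array([np.polyfit(x, y, 1)[0] for y in series])

# Step 2: fit the trend of slope against line index (assumed linear)
index = np.arange(len(fitted_slopes))
trend_step, trend_start = np.polyfit(index, fitted_slopes, 1)

# Step 3: extrapolate the slopes of the next three lines
new_index = np.arange(len(fitted_slopes), len(fitted_slopes) + 3)
new_slopes = trend_start + trend_step * new_index

# Step 4: y-value at x = 0 for each new line so it still passes through the pivot
new_y_at_zero = y_pivot + new_slopes * (0.0 - x_pivot)
```

Each extrapolated line can then be drawn on the existing axes with `ax.plot([0, x_pivot], [new_y_at_zero[i], y_pivot])`. With real, noisy data the fitted lines will not meet exactly at one point, so the pivot would have to be chosen or estimated separately.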

Upvotes: 4
