Reputation: 11
Original Predicted
0 6 1.56
1 12.2 3.07
2 0.8 2.78
3 5.2 3.54
.
Code that I have tried:
def plotGraph(y_test,y_pred,regressorName):
    if max(y_test) >= max(y_pred):
        my_range = int(max(y_test))
    else:
        my_range = int(max(y_pred))
    plt.scatter(y_test, y_pred, color='red')
    plt.plot(range(my_range), range(my_range), 'o')
    plt.title(regressorName)
    plt.show()
    return
But my current output:
Upvotes: 0
Views: 32379
Reputation: 49
You are plotting y_test on the x axis and y_pred on the y axis.
What you want is a common data point on the x axis, with y_test and y_pred both on the y axis.
The following snippet will help you achieve that (true_value and predicted_value are the lists to be plotted, and common is the list from your dataframe used as the shared x axis):
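import matplotlib.pyplot as plt

fig = plt.figure()
a1 = fig.add_axes([0,0,1,1])
x = common  # shared x values
a1.plot(x,true_value, 'ro')  # actual values as red dots, left y axis
a1.set_ylabel('Actual')
a2 = a1.twinx()  # second y axis sharing the same x axis
a2.plot(x, predicted_value,'o')  # predicted values, right y axis
a2.set_ylabel('Predicted')
fig.legend(labels = ('Actual','Predicted'),loc='upper left')
plt.show()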
Upvotes: 1
Reputation: 2830
The problem seems to be that you mix y_test and y_pred into one "plot" (meaning here the scatter() call).
For both scatter() and plot() (which you also mixed up), the first parameter is the coordinates on the x axis and the second parameter is the coordinates on the y axis.
So 1.) you need one scatter() call with only y_test and another with only y_pred. To do this you 2.) either need 2D data or, as seems to be the case here, can just use indexes for the x axis via range().
Here is some code with random data, that might get you started:
import matplotlib.pyplot as plt
import numpy as np

def plotGraph(y_test,y_pred,regressorName):
    # Use the sample index as the shared x axis for both series
    plt.scatter(range(len(y_test)), y_test, color='blue')
    plt.scatter(range(len(y_pred)), y_pred, color='red')
    plt.title(regressorName)
    plt.show()

y_test = range(10)
y_pred = np.random.randint(0, 10, 10)
plotGraph(y_test, y_pred, "test")
This will give you something like this:
Upvotes: 3
Reputation: 95
The matplotlib documentation (from the code I assume that is what you're using) describes the first two parameters of matplotlib.pyplot.scatter as:
x, y : float or array-like, shape (n, )
The data positions.
So for your application you need to draw two scatter plots on the same graph by calling matplotlib.pyplot.scatter twice: first with y_test as y and color='red', then with y_pred as y and color='blue'.
Unfortunately, you didn't say what the x values for y_test and y_pred are, but you will need those as well to define x in your plt.scatter calls.
Drawing two scatter plots on one graph is slightly tricky; the approach in this answer keeps a reference to an Axes object and draws both on it. For example (as given in that answer):
import matplotlib.pyplot as plt

x = range(100)
y = range(100,200)
fig = plt.figure()
ax1 = fig.add_subplot(111)
# Both scatter calls draw on the same Axes object
ax1.scatter(x[:4], y[:4], s=10, c='b', marker="s", label='first')
ax1.scatter(x[40:],y[40:], s=10, c='r', marker="o", label='second')
plt.legend(loc='upper left')
plt.show()
Take a look at the matplotlib documentation and the mentioned answer for more details.
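Applied to the four sample rows from the question, the same idea might look like the sketch below. The helper name plot_actual_vs_predicted is made up for illustration, and using the row index as the x value is an assumption, since the question does not say what the x values are.

```python
import matplotlib.pyplot as plt


def plot_actual_vs_predicted(actual, predicted, title):
    """Scatter both series against their row index (hypothetical helper)."""
    fig, ax = plt.subplots()
    # Assumption: the row index serves as the shared x axis
    ax.scatter(range(len(actual)), actual, color="red", label="Original")
    ax.scatter(range(len(predicted)), predicted, color="blue", label="Predicted")
    ax.set_xlabel("Row index")
    ax.set_title(title)
    ax.legend(loc="upper left")
    return ax


# The four sample rows from the question
original = [6, 12.2, 0.8, 5.2]
predicted = [1.56, 3.07, 2.78, 3.54]
ax = plot_actual_vs_predicted(original, predicted, "test")
```

Call plt.show() afterwards to display the figure. Returning the Axes object lets you adjust labels or limits after the fact.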
Upvotes: 0
Reputation: 313
I cannot run your code, but a few things stand out at first glance. First of all, the data points in the graph you want appear to be normalized: you need to divide every data point in a column by the maximum value of that column.
You should also check the legend function in the documentation to add a legend like the one in the graph you want.
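A minimal sketch of that normalization, using the sample rows from the question. The helper name normalize_by_max is made up for illustration, and it assumes the column maximum is positive. Note that each column is scaled by its own maximum, so the two series end up on a common 0 to 1 range rather than on their original relative scale.

```python
import numpy as np


def normalize_by_max(values):
    """Divide every value by the column maximum (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    col_max = arr.max()
    if col_max == 0:
        raise ValueError("cannot normalize a column whose maximum is 0")
    return arr / col_max


# The four sample rows from the question
original = normalize_by_max([6, 12.2, 0.8, 5.2])
predicted = normalize_by_max([1.56, 3.07, 2.78, 3.54])
```

The normalized arrays can then be passed to the scatter calls in place of the raw values.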
Upvotes: 0