Reputation: 55
I have recently built an SVR model on the diamonds dataset to predict the price of a diamond, based on some specific features. I was trying to plot the test features of my model against the predicted price. Below is an explanation of the variables used in the code.
X_test - the features held out by the train/test split and used to test the model. Shape (10782, 7), one column per entry in diamonds_features.
y_pred - the price the model predicts for each row of X_test. Shape (10782,).
Below is the code showing how these come into play:
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

diamonds_features = ['carat', 'x', 'y', 'z', 'color', 'cut', 'clarity']
X = df.loc[:, diamonds_features].values
y = df.iloc[:, 6:7].values
X_train, X_test, y_train, y_test = train_test_split(X, y.ravel(), test_size=0.20)
regressor = SVR(kernel='rbf', C=50, gamma=10)
regressor.fit(X_train, y_train)
# produce test predictions
y_pred = regressor.predict(X_test)
Below is the code for plotting the outcome of the model.
colorGroup = ['b','g','r','c','m','y','k','w']
plt.figure(1)
for i in range(len(X_test)):
    col = colorGroup[i % 8]
    for j in range(8):
        plt.scatter(X_test[i:i+1, j:j+1], y_pred[i:i+1], color=col)
To work around the fact that X_test and y_pred have different shapes, I wanted to do the following:
For each value in y_pred (it is a 1d array, so that means every element), take every value in the matching row of X_test and plot it against that y_pred value. I also want to use mod so that each feature gets its own colour (e.g. carat is drawn in one consistent colour throughout the plot).
The issue I get with this code is that I get the following: "ValueError: x and y must be the same size"
If anyone could point out where I am going wrong with this, I would be grateful.
Here is the Traceback I am receiving:
Traceback (most recent call last):
  File "C:\Users\mypackage\SVM Model.py", line 72, in <module>
    plt.scatter(X_test[i:i+1, j:j+1], y_pred[i:i+1], color=colorGroup[i%8])
  File "C:\Users\anaconda3\lib\site-packages\matplotlib\pyplot.py", line 2890, in scatter
    __ret = gca().scatter(
  File "C:\Users\anaconda3\lib\site-packages\matplotlib\__init__.py", line 1438, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "C:\Users\anaconda3\lib\site-packages\matplotlib\cbook\deprecation.py", line 411, in wrapper
    return func(*inner_args, **inner_kwargs)
  File "C:\Users\anaconda3\lib\site-packages\matplotlib\axes\_axes.py", line 4441, in scatter
    raise ValueError("x and y must be the same size")
ValueError: x and y must be the same size
Edit: updated the question with the Traceback
Upvotes: 0
Views: 206
Reputation: 5735
Going by your comment, I believe this is what you want.
As an example, I used 20 points with 3 features:
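import numpy as np
import matplotlib.pyplot as plt

X_test = np.random.rand(20, 3)
y_pred = np.random.rand(20)
n_features = X_test.shape[1]  # loop over columns (features), not rows
colorGroup = ['b','g','r','c','m','y','k','w']
plt.figure(1)
for i in range(n_features):
    # one colour per feature, wrapping around if features outnumber colours
    col = colorGroup[i % len(colorGroup)]
    # plot the whole feature column against all predictions in one call
    plt.scatter(X_test[:, i], y_pred, color=col)
plt.show()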
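As for why the loop in the question raises the ValueError, a likely cause is the inner `range(8)`. Assuming X_test really has 7 columns (as the `diamonds_features` list suggests), the slice `j:j+1` with `j = 7` runs past the last column. NumPy does not complain about an out-of-range slice; it quietly returns an empty array, and `scatter` then sees an x of size 0 next to a y of size 1. The sketch below uses small random stand-in data, not the diamonds set, just to show the shapes:

```python
import numpy as np

# Stand-in data: 5 rows, 7 feature columns (assumed to mirror the question's X_test).
rng = np.random.default_rng(0)
X_test = rng.random((5, 7))
y_pred = rng.random(5)

n_features = X_test.shape[1]      # 7, so valid column indices are 0..6

in_range = X_test[0:1, 6:7]       # last real column
out_of_range = X_test[0:1, 7:8]   # past the end: empty slice, no IndexError
y_slice = y_pred[0:1]

print(in_range.shape, in_range.size)          # (1, 1) 1
print(out_of_range.shape, out_of_range.size)  # (1, 0) 0
print(y_slice.shape, y_slice.size)            # (1,) 1
```

Looping over `range(X_test.shape[1])` instead of a hard-coded 8 avoids the empty slice. Note also that `colorGroup[i % 8]` in the question colours by row index `i`; to get one colour per feature, the colour would have to depend on the column index `j`.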
Upvotes: 1