Tanay Rastogi

Reputation: 81

Tuning parameters for SVM Regression

I am trying to create a support vector regression (SVR) model. I am generating the data from the sinc function with some added Gaussian noise.

Now, in order to find the best parameters for the RBF kernel, I am using GridSearchCV with 5-fold cross-validation.

P.S. - I am new to Python and machine learning, so the code may not be very optimised or correct in places.

My code:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_squared_error


def generateData(N, sigmaT):  
    # Input datapoints 
    data = np.reshape(np.linspace(-10, 10, N), (N,1))
    # Noise in target with zero mean and variance sigmaT
    epi = np.random.normal(0 , sigmaT, N)

    # Target
    t1 = np.sinc(data).ravel()              # target without noise
    t2 = np.sinc(data).ravel() + epi        # target with noise
    t1 = np.reshape(t1, (N, 1))
    t2 = np.reshape(t2, (N, 1))

    # Plot the generated data
    plt.plot(data, t1, '--r', label = 'Original Curve')
    plt.scatter(data, t2, c = 'orange', label = 'Data')
    plt.title("Generated data")

    return data, t2, t1


# Generate data from the sinc function
N = 100                         # Number of data points
sigmaT = 0.1                    # Noise in the data 
plt.figure(1)
X, y, true = generateData(N, sigmaT)
y = y.ravel()

# Tuning of parameters for regression by cross-validation
K = 5               # Number of cross-validation folds

# Parameters for tuning
parameters = [{'kernel': ['rbf'], 'gamma': [1e-4, 1e-3, 0.01, 0.1, 0.2, 0.5, 0.6, 0.9],'C': [1, 10, 100, 1000, 10000]}]
print("Tuning hyper-parameters")
svr = GridSearchCV(SVR(epsilon = 0.01), parameters, cv = K)
svr.fit(X, y)

print("Best parameters set found on development set: ", svr.best_params_)

# Checking the score for all parameters
print("Grid scores on training set:")
means = svr.cv_results_['mean_test_score']
stds = svr.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, svr.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"% (mean, std * 2, params))

And the result is

Best parameters set found on development set:  {'gamma': 0.0001, 'kernel': 'rbf', 'C': 1}
Grid scores on training set:
-0.240 (+/-0.366) for {'gamma': 0.0001, 'kernel': 'rbf', 'C': 1}
-0.535 (+/-1.076) for {'gamma': 0.001, 'kernel': 'rbf', 'C': 1}
-0.863 (+/-1.379) for {'gamma': 0.01, 'kernel': 'rbf', 'C': 1}
-3.057 (+/-4.954) for {'gamma': 0.1, 'kernel': 'rbf', 'C': 1}
-1.576 (+/-3.185) for {'gamma': 0.2, 'kernel': 'rbf', 'C': 1}
-0.439 (+/-0.048) for {'gamma': 0.5, 'kernel': 'rbf', 'C': 1}
-0.417 (+/-0.110) for {'gamma': 0.6, 'kernel': 'rbf', 'C': 1}
-0.370 (+/-0.248) for {'gamma': 0.9, 'kernel': 'rbf', 'C': 1}
-0.514 (+/-0.724) for {'gamma': 0.0001, 'kernel': 'rbf', 'C': 10}
-1.308 (+/-3.002) for {'gamma': 0.001, 'kernel': 'rbf', 'C': 10}
-4.717 (+/-10.886) for {'gamma': 0.01, 'kernel': 'rbf', 'C': 10}
-14.247 (+/-27.218) for {'gamma': 0.1, 'kernel': 'rbf', 'C': 10}
-15.241 (+/-19.086) for {'gamma': 0.2, 'kernel': 'rbf', 'C': 10}
-0.533 (+/-0.571) for {'gamma': 0.5, 'kernel': 'rbf', 'C': 10}
-0.566 (+/-0.527) for {'gamma': 0.6, 'kernel': 'rbf', 'C': 10}
-1.087 (+/-1.828) for {'gamma': 0.9, 'kernel': 'rbf', 'C': 10}
-0.591 (+/-1.218) for {'gamma': 0.0001, 'kernel': 'rbf', 'C': 100}
-2.111 (+/-2.940) for {'gamma': 0.001, 'kernel': 'rbf', 'C': 100}
-19.591 (+/-29.731) for {'gamma': 0.01, 'kernel': 'rbf', 'C': 100}
-96.461 (+/-96.744) for {'gamma': 0.1, 'kernel': 'rbf', 'C': 100}
-14.430 (+/-10.858) for {'gamma': 0.2, 'kernel': 'rbf', 'C': 100}
-14.742 (+/-37.705) for {'gamma': 0.5, 'kernel': 'rbf', 'C': 100}
-7.915 (+/-10.308) for {'gamma': 0.6, 'kernel': 'rbf', 'C': 100}
-1.592 (+/-1.513) for {'gamma': 0.9, 'kernel': 'rbf', 'C': 100}
-1.543 (+/-3.654) for {'gamma': 0.0001, 'kernel': 'rbf', 'C': 1000}
-4.629 (+/-10.477) for {'gamma': 0.001, 'kernel': 'rbf', 'C': 1000}
-65.690 (+/-92.825) for {'gamma': 0.01, 'kernel': 'rbf', 'C': 1000}
-2745.336 (+/-4173.978) for {'gamma': 0.1, 'kernel': 'rbf', 'C': 1000}
-248.269 (+/-312.776) for {'gamma': 0.2, 'kernel': 'rbf', 'C': 1000}
-65.826 (+/-132.946) for {'gamma': 0.5, 'kernel': 'rbf', 'C': 1000}
-28.569 (+/-64.979) for {'gamma': 0.6, 'kernel': 'rbf', 'C': 1000}
-6.955 (+/-8.647) for {'gamma': 0.9, 'kernel': 'rbf', 'C': 1000}
-3.647 (+/-7.858) for {'gamma': 0.0001, 'kernel': 'rbf', 'C': 10000}
-12.712 (+/-29.380) for {'gamma': 0.001, 'kernel': 'rbf', 'C': 10000}
-1094.270 (+/-2262.303) for {'gamma': 0.01, 'kernel': 'rbf', 'C': 10000}
-3698.268 (+/-8085.389) for {'gamma': 0.1, 'kernel': 'rbf', 'C': 10000}
-2079.620 (+/-3651.872) for {'gamma': 0.2, 'kernel': 'rbf', 'C': 10000}
-70.982 (+/-159.707) for {'gamma': 0.5, 'kernel': 'rbf', 'C': 10000}
-89.859 (+/-180.071) for {'gamma': 0.6, 'kernel': 'rbf', 'C': 10000}
-661.291 (+/-1636.522) for {'gamma': 0.9, 'kernel': 'rbf', 'C': 10000}

Now GridSearchCV gives me the best parameters as C=1, gamma=0.0001, but I checked that the parameters should be C=1000, gamma=0.5.

My question is: why does GridSearchCV select C=1, gamma=0.0001 when C=1000, gamma=0.5 clearly fits the data better?

Edit: I am also adding the code showing how I found the working parameters. I simply plugged the parameters into the SVR and checked the mean squared error.

# Working parameters
svr = SVR(kernel='rbf', C=1e3, gamma = 0.5, epsilon = 0.01)
y_rbf = svr.fit(X, y).predict(X)

# Plotting
plt.figure(1)
plt.plot(X, y_rbf, c = 'navy', label = 'Predicted')
plt.legend()

# Checking prediction error
print("Mean squared error: %.2f" % mean_squared_error(true, y_rbf))

The plot for the above parameters is at https://i.sstatic.net/8TH27.jpg

The plot for the parameters chosen by GridSearchCV is at https://i.sstatic.net/rv3Sb.jpg
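For completeness, here is a minimal sketch of how the second plot can be reproduced (this snippet is not part of my original script; it just refits an SVR with the values GridSearchCV selected, reusing X, y and true from above):

# Refit with the parameters GridSearchCV selected (C=1, gamma=0.0001)
svr_gs_fit = SVR(kernel='rbf', C=1, gamma=0.0001, epsilon=0.01)
y_gs = svr_gs_fit.fit(X, y).predict(X)

# Plot against the noise-free curve and check the prediction error
plt.figure(2)
plt.plot(X, true, '--r', label='Original Curve')
plt.plot(X, y_gs, c='navy', label='GridSearchCV fit')
plt.legend()
print("Mean squared error: %.2f" % mean_squared_error(true, y_gs))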

Upvotes: 4

Views: 22894

Answers (1)

Vivek Kumar

Reputation: 36599

A couple of things play an important part here:

1) The scoring criterion used by GridSearchCV to select the best parameters. Since you have not provided any value for the scoring parameter of GridSearchCV, SVR's own scoring method will be used, which is the R-squared value, not the mean_squared_error you used.

That can be fixed by doing this:

from sklearn.metrics import make_scorer
scorer = make_scorer(mean_squared_error, greater_is_better=False)
svr_gs = GridSearchCV(SVR(epsilon = 0.01), parameters, cv = K, scoring=scorer)
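Note that with greater_is_better=False, make_scorer negates the MSE so that GridSearchCV can still maximize the score. The reported grid scores will therefore be negative, and the best parameter set is the one whose score is closest to zero.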

2) The amount of data used by GridSearchCV for training. Grid search splits the data into train and test folds according to the cv value provided (in your case K=5, so 5-fold cross-validation is used). This means grid search trains the SVR on the training folds and computes the score on the held-out test fold, not on the whole dataset as you are doing. This leads to different results. With K=5, only 80% of the data is used for training at any one time, which is less data than you are using.
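As an illustration (this snippet is not from the original code), you can check the fold sizes directly:

from sklearn.model_selection import KFold
import numpy as np

# With N=100 points and 5 folds, each model trains on 80 points
# and is scored on the remaining 20
for train_idx, test_idx in KFold(n_splits=5).split(np.zeros((100, 1))):
    print(len(train_idx), len(test_idx))   # prints "80 20" five times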

That can be fixed by increasing the value of K to, say, 15, 20, or 25, as sketched below.
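Putting both changes together, a minimal sketch (reusing the parameters grid and the X, y data from the question):

from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer, mean_squared_error

# Score on (negated) MSE instead of the default R-squared,
# and use more folds so each model trains on 95% of the data
scorer = make_scorer(mean_squared_error, greater_is_better=False)
K = 20

svr_gs = GridSearchCV(SVR(epsilon = 0.01), parameters, cv = K, scoring=scorer)
svr_gs.fit(X, y)
print("Best parameters:", svr_gs.best_params_)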

After doing these two changes, this is what I get:

[Image: GridSearch results]

Upvotes: 5
