sklearn SVR gives wrong results when the training data obviously shows a pattern

Question

I have the following training data:

x = [
    [0.914728682,5.217,5,0.217,3.150362319,33.36,35,-1.64,4.220113852],
    [0.885057471,7.793,8,-0.207,3.380911063,46.84,48,-1.16,4.448243115],
    [0.871345029,7.152,7,0.152,3.976205037,44.98,47,-2.02,5.421236592],
    [0.821428571,8.04,8,0.04,2.909880565,52.02,54.5,-2.48,2.824104235],
    [0.931372549,8.01,8,0.01,4.616714697,48.04,48,0.04,9.650462033],
    [0.66367713,5.424,5.5,-0.076,1.37804878,32.6,35.5,-2.9,1.189781022],
    [0.78,8.66,9,-0.34,2.272965879,48.47,55,-6.53,2.564550265],
    [0.227272727,19.55,21,-1.45,1.860133206,128.23,147,-18.77,1.896893491],
    [0.47826087,10.09,8,2.09,1.155519927,74.43,64,10.43,1.169547454],
    [0.652694611,6.775,4,2.775,1.05529595,43.1,30,13.1,1.062885327],
    [0.798561151,3.986,2,1.986,0.656563993,25.38,13,12.38,0.652442159],
    [0.666666667,5.419,3,2.419,1.057985162,34.37,16,18.37,0.981719509],
    [0.5625,7.719,2,5.719,0.6421797,46.91,12,34.91,0.665673336]
]

and the following labels (scores):

y = [0.237113402,0.168831169,0.104166667,0.086419753,0.063147368,0.016042781,
     0.014814815,0,0,-0.0794,-0.14,-0.1832,-0.2385]

It seems clear that the larger the values in columns 5 and 9, the higher the score.

I wrote the following code, which uses SVR on the training data above:

from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVR

rb = RobustScaler()
xScaled = rb.fit_transform(x)
model = SVR(C=1.0, epsilon=0.1)
model.fit(xScaled, y)

But no matter which of the following I use for prediction, the score does not look right:

1. score = model.predict(rb.fit_transform(testData))

2. score = model.predict(testData)

If I instead scale the data like this during training:

from sklearn import preprocessing

xScaled = preprocessing.scale(x)
model = SVR(C=1.0, epsilon=0.1)
model.fit(xScaled, y)

then:

score = model.predict(testData)

I get back something close to the original y.

But if I pick a row of x, put it in a 2D array with a single row called testData, and run:

score = model.predict(testData)

I get a wrong score. In fact, no matter which row of x I use to build the one-row 2D array, I get the same score.
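
One likely reason every single row yields the same score: if the one-row testData is rescaled on its own (with rb.fit_transform or preprocessing.scale), each column's median or mean is just that row's own value, so every feature is centered to zero and every row turns into the same all-zero vector. A minimal sketch of that effect, using the first row of x from above; the names first_row, row_robust and row_standard are illustrative only:

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, scale

# First training row from the question, as a 2D array with a single row.
first_row = np.array([[0.914728682, 5.217, 5, 0.217, 3.150362319,
                       33.36, 35, -1.64, 4.220113852]])

# Refitting a scaler on one row: the per-column median (or mean) equals
# the row itself, so centering leaves nothing but zeros.
row_robust = RobustScaler().fit_transform(first_row)
row_standard = scale(first_row)

print(row_robust)
print(row_standard)
```

Any other row of x would be mapped to the same zero vector, which would explain why the model returns an identical score each time.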

What have I done wrong? I would be extremely grateful if someone could help.
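
A minimal sketch of the usual pattern, assuming the goal is to score new rows with exactly the scaling the model was trained on: the scaler is fitted once on x, and at prediction time the data is only transformed, never refitted. Wrapping the scaler and SVR in a pipeline handles this automatically. The names pipe, preds and single are illustrative, and this is not necessarily what the posted answer proposed:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVR

# Training data copied from the question.
x = np.array([
    [0.914728682, 5.217, 5, 0.217, 3.150362319, 33.36, 35, -1.64, 4.220113852],
    [0.885057471, 7.793, 8, -0.207, 3.380911063, 46.84, 48, -1.16, 4.448243115],
    [0.871345029, 7.152, 7, 0.152, 3.976205037, 44.98, 47, -2.02, 5.421236592],
    [0.821428571, 8.04, 8, 0.04, 2.909880565, 52.02, 54.5, -2.48, 2.824104235],
    [0.931372549, 8.01, 8, 0.01, 4.616714697, 48.04, 48, 0.04, 9.650462033],
    [0.66367713, 5.424, 5.5, -0.076, 1.37804878, 32.6, 35.5, -2.9, 1.189781022],
    [0.78, 8.66, 9, -0.34, 2.272965879, 48.47, 55, -6.53, 2.564550265],
    [0.227272727, 19.55, 21, -1.45, 1.860133206, 128.23, 147, -18.77, 1.896893491],
    [0.47826087, 10.09, 8, 2.09, 1.155519927, 74.43, 64, 10.43, 1.169547454],
    [0.652694611, 6.775, 4, 2.775, 1.05529595, 43.1, 30, 13.1, 1.062885327],
    [0.798561151, 3.986, 2, 1.986, 0.656563993, 25.38, 13, 12.38, 0.652442159],
    [0.666666667, 5.419, 3, 2.419, 1.057985162, 34.37, 16, 18.37, 0.981719509],
    [0.5625, 7.719, 2, 5.719, 0.6421797, 46.91, 12, 34.91, 0.665673336],
])
y = np.array([0.237113402, 0.168831169, 0.104166667, 0.086419753, 0.063147368,
              0.016042781, 0.014814815, 0, 0, -0.0794, -0.14, -0.1832, -0.2385])

# The pipeline fits RobustScaler on x, then fits SVR on the scaled x.
pipe = make_pipeline(RobustScaler(), SVR(C=1.0, epsilon=0.1))
pipe.fit(x, y)

# predict() reuses the fitted scaler (transform only), so raw, unscaled
# rows can be passed in directly, whether as a batch or one at a time.
preds = pipe.predict(x)
single = pipe.predict(x[[7]])  # one-row 2D array built from a row of x

print(preds)
print(single)
```

Without a pipeline, the equivalent calls would be rb.fit_transform(x) during training and model.predict(rb.transform(testData)) at prediction time. Separately, the targets here span only about -0.24 to 0.24, and epsilon=0.1 tells SVR to ignore errors smaller than 0.1, so a smaller epsilon may be worth trying if the predictions still look too flat.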
