Reputation: 59
I am trying to set up Python code for forecasting a time series, using the SVM model from scikit-learn.
My data contains X values at 30-minute intervals for the last 24 hours, and I need to predict y for the next timestamp. Here's what I have set up:
SVR(kernel='linear', C=1e3).fit(X, y).predict(X)
But for this prediction to work, I need the X value for the next timestamp, which is not available. How do I set this up to predict future y values?
Upvotes: 5
Views: 13999
Reputation: 10388
You should use SVR this way:
# prepare the model and set its parameters
svr_model = SVR(kernel='linear', C=1e3)
# fit the model on the training set
svr_model.fit(TRAINING_SET, TRAINING_LABEL)
# predict on a test set
svr_model.predict(TEST_SET)
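So, the problem here is that you have a training set but no test set to measure your model's accuracy. The usual solution is to hold out part of your training set as a test set, e.g. 80% for training and 20% for testing. Since this is a time series, keep the split in chronological order so that the test samples come after the training samples.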
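A minimal sketch of such a hold-out split, assuming the samples are already sorted by time. The data here is synthetic and the names X_train, X_test, y_train and y_test are only illustrative; they are not from the question:

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical data: 48 half-hourly readings (24 hours), one feature per row.
rng = np.random.default_rng(0)
X = (np.arange(48) * 0.5).reshape(-1, 1)          # time in hours
y = 0.5 * X.ravel() + rng.normal(scale=0.1, size=48)

# Chronological 80/20 split: the most recent 20% of samples form the test set.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

svr_model = SVR(kernel='linear', C=1e3)
svr_model.fit(X_train, y_train)

# R^2 score on the held-out, most recent samples.
score = svr_model.score(X_test, y_test)
print(score)
```

Slicing by position rather than shuffling keeps future samples out of the training set, which is usually what you want when evaluating a forecast.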
EDIT (after comments)
So you want to predict the next label after the last hour in your training set. Here is an example of what you want:
from sklearn.svm import SVR
import random
import numpy as np
# data: the training set, 24 elements (10.0, 10.5, ..., 21.5)
# label: the label for each time (dummy values: one random number repeated 24 times)
data = [10 + 0.5 * x for x in range(24)]
label = [random.random()] * 24
# reshape the training set into a (24, 1) column and the labels into a (24,) array
DATA = np.array([data]).T
LABEL = np.array(label)
# declare the model and fit it
clf = SVR(kernel='linear', C=1e3)
clf.fit(DATA, LABEL)
# predict the next label: the input is the last time value plus one 0.5 step,
# shaped (1, 1) because predict expects a 2D array
to_predict = np.array([[DATA[23, 0] + 0.5]])
print(clf.predict(to_predict))
>> 0.94407674
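If no future X value is available at all, one common alternative is to build the features from lagged values of the series itself, so that the input for the next timestamp consists only of values already observed. This is a sketch under that assumption, not something taken from the question: the helper make_lagged, the window length n_lags and the synthetic sine series are all hypothetical.

```python
import numpy as np
from sklearn.svm import SVR

def make_lagged(series, n_lags):
    """Build rows of n_lags consecutive values; the target is the value that follows each row."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Hypothetical series: 48 half-hourly readings.
t = np.arange(48)
series = np.sin(2 * np.pi * t / 48)

n_lags = 4  # assumed window length; tune for your data
X_lag, y_lag = make_lagged(series, n_lags)

model = SVR(kernel='linear', C=1e3)
model.fit(X_lag, y_lag)

# Features for the next timestamp are the last n_lags observed values.
next_features = series[-n_lags:].reshape(1, -1)
next_value = model.predict(next_features)
print(next_value)
```

To forecast more than one step ahead with this setup, you would append each prediction to the series and repeat, at the cost of errors accumulating from step to step.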
Upvotes: 4