Raj

Reputation: 59

Time series forecasting with scikit-learn

I am trying to set up Python code to forecast a time series using the SVM model from scikit-learn.

My data contains X values at 30-minute intervals for the last 24 hours, and I need to predict y for the next timestamp. Here's what I have set up:

SVR(kernel='linear', C=1e3).fit(X, y).predict(X)

But for this prediction to work, I need the X value for the next timestamp, which is not available. How do I set this up to predict future y values?

Upvotes: 5

Views: 13999

Answers (1)

farhawa

Reputation: 10388

You should use SVR this way:

from sklearn.svm import SVR

# prepare the model and set its parameters
svr_model = SVR(kernel='linear', C=1e3)
# fit the model on the training set
svr_model.fit(TRAINING_SET, TRAINING_LABEL)
# predict on a test set
svr_model.predict(TEST_SET)

So the problem here is that you have a training set but no test set to measure your model's accuracy. The only solution is to hold out part of your training set as a test set, e.g. 80% for training and 20% for testing.
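As a rough sketch of that 80/20 idea, the snippet below holds out the last 20% of the samples rather than a random 20%. Keeping the time order is a choice made here, because shuffling a time series would let the model train on points that come after the ones it is tested on. The data, the variable names and the use of `mean_absolute_error` are illustrative assumptions, not part of the question.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

# Hypothetical data: 48 half-hourly readings (24 hours),
# with the time index as the only feature.
rng = np.random.default_rng(0)
X = np.arange(48, dtype=float).reshape(-1, 1)
y = 10 + 0.5 * X.ravel() + rng.normal(scale=0.1, size=48)

# Keep the time order: first 80% for training, last 20% for testing.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

svr_model = SVR(kernel='linear', C=1e3)
svr_model.fit(X_train, y_train)

# Compare predictions on the held-out tail with the known labels.
y_pred = svr_model.predict(X_test)
print("MAE on held-out tail:", mean_absolute_error(y_test, y_pred))
```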

EDIT (after comments)

So you want to predict the label for the timestamp that follows the last one in your training set. Here is an example of that:

from sklearn.svm import SVR
import random
import numpy as np

'''
data: the train set, 24 elements
label: label for each time
'''

data = [10 + 0.5 * x for x in range(24)]
# placeholder labels: one random value repeated 24 times
label = [random.random()] * 24

# reshaping the train set and the label ...

DATA = np.array([data]).T
LABEL = np.array(label)

# Declaring model and fitting it

clf  = SVR(kernel='linear', C=1e3)
clf.fit(DATA, LABEL)

# predict the next label 

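# DATA[23, 0] is the last feature value; the next one is 0.5 later.
# predict() expects a 2D array of shape (n_samples, n_features).
to_predict = np.array([[DATA[23, 0] + 0.5]])

print(clf.predict(to_predict))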

>> 0.94407674
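The same idea can be extended to several future timestamps, as long as the feature can be computed ahead of time (as with a time index that simply keeps increasing by 0.5). The following is a minimal sketch under that assumption. The labels are a made-up noisy linear trend used only for illustration, and the names `future` and `predictions` are not from the original example.

```python
import numpy as np
from sklearn.svm import SVR

# Same feature layout as the example above: 10.0, 10.5, ..., 21.5
DATA = np.array([[10 + 0.5 * i] for i in range(24)])

# Hypothetical labels (assumption): a noisy linear trend
rng = np.random.default_rng(0)
LABEL = 2.0 * DATA.ravel() + rng.normal(scale=0.05, size=24)

clf = SVR(kernel='linear', C=1e3)
clf.fit(DATA, LABEL)

# Continue the 0.5 step to build the next four feature values
last = DATA[-1, 0]
future = (last + 0.5 * np.arange(1, 5)).reshape(-1, 1)

# One prediction per future feature value
predictions = clf.predict(future)
for x, p in zip(future.ravel(), predictions):
    print(x, p)
```

If the future X values cannot be known in advance, this approach does not apply as written, since `predict` always needs a feature row for each timestamp it is asked about.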

Upvotes: 4
