I am a complete newbie to SVM-based forecasting, so I am looking for some guidance here. I am trying to set up Python code for forecasting a time series using the SVM module of scikit-learn.
My data contains X values at 30-minute intervals for the last 24 hours, and I need to predict y for the next timestamp. Here's what I have set up:
SVR(kernel='linear', C=1e3).fit(X, y).predict(X)
But for this prediction to work, I need the X value for the next timestamp, which is not available. How do I set this up to predict future y values?
You should use SVR this way:
from sklearn.svm import SVR
# prepare the model and set its parameters
svr_model = SVR(kernel='linear', C=1e3)
# fit the model on the training set (placeholder names)
svr_model.fit(TRAINING_SET, TRAINING_LABEL)
# predict on a test set
svr_model.predict(TEST_SET)
So the problem here is that you have a training set but no test set to measure your model's accuracy. The usual fix is to hold out part of your training set as a test set, e.g. 80% for training and 20% for testing.
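For a time series it usually makes sense to split chronologically rather than at random, so the model is tested on data that comes after what it was trained on. Here is a minimal sketch of such an 80/20 split. The toy series, the use of the time index as the only feature, and the names `split`, `X_train`, `X_test` are my assumptions for illustration, not your actual data:

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical toy series: 48 half-hourly readings (not the asker's data)
rng = np.random.default_rng(0)
X = np.arange(48, dtype=float).reshape(-1, 1)  # time index as the single feature
y = 0.5 * X.ravel() + rng.normal(scale=0.1, size=48)

# Chronological 80/20 split: no shuffling, so the test set is the most recent data
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

svr_model = SVR(kernel='linear', C=1e3)
svr_model.fit(X_train, y_train)

# R^2 on the held-out tail of the series
score = svr_model.score(X_test, y_test)
print(score)
```

The score you get on the held-out 20% is only a rough estimate of how the model will do on truly unseen timestamps.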
EDIT
I hope I understood correctly what you want from your comments.
So you want to predict the label for the timestamp that follows the last one in your training set. Here is an example of that:
from sklearn.svm import SVR
import random
import numpy as np
'''
data: the train set, 24 elements
label: label for each time
'''
data = [10 + 0.5 * x for x in range(24)]
# placeholder labels: one random value repeated 24 times
label = [random.random()] * 24
# reshaping the train set and the label ...
DATA = np.array([data]).T
LABEL = np.array(label)
# Declaring model and fitting it
clf = SVR(kernel='linear', C=1e3)
clf.fit(DATA, LABEL)
# predict the next label
# the next timestamp is the last one plus 0.5; predict() expects a 2D array
to_predict = np.array([[DATA[23, 0] + 0.5]])
print(clf.predict(to_predict))
>> 0.94407674
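If the time value alone is not a useful feature, a common alternative (not part of the example above) is to use the previous few observations as the features, so that the inputs for the next timestamp are always available. A rough sketch, assuming a hypothetical helper `make_lagged`, a made-up sine series, and a window of 4 past values:

```python
import numpy as np
from sklearn.svm import SVR

def make_lagged(series, n_lags):
    """Hypothetical helper: each row of X holds n_lags consecutive values,
    and y holds the value that immediately follows each row."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Made-up series: 48 half-hourly readings (24 hours)
series = np.sin(np.linspace(0, 4 * np.pi, 48))
n_lags = 4  # assumed window size, tune for your data

X, y = make_lagged(series, n_lags)
model = SVR(kernel='linear', C=1e3)
model.fit(X, y)

# The features for the next, unseen timestamp are the last n_lags observations
next_features = series[-n_lags:].reshape(1, -1)
next_value = model.predict(next_features)
print(next_value)
```

To forecast more than one step ahead with this setup, you would append each prediction to the series and repeat, keeping in mind that errors tend to accumulate.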