Scikit-learn grid search with SVM regression

Eka picture Eka · Mar 30, 2016 · Viewed 8k times · Source

I am learning cross validation-grid search and came across this youtube playlist and the tutorial also has been uploaded to the github as an ipython notebook. I am trying to recreate the codes in the Searching multiple parameters simultaneously section but instead of using knn i am using SVM Regression. This is my code

from sklearn.datasets import load_iris
from sklearn import svm
from sklearn.grid_search import GridSearchCV
import matplotlib.pyplot as plt
import numpy as np
iris = load_iris()
X = iris.data
y = iris.target

k=['rbf', 'linear','poly','sigmoid','precomputed']
c= range(1,100)
g=np.arange(1e-4,1e-2,0.0001)
g=g.tolist()
param_grid=dict(kernel=k, C=c, gamma=g)
print param_grid
svr=svm.SVC()
grid = GridSearchCV(svr, param_grid, cv=5,scoring='accuracy')
grid.fit(X, y)  
print()
print("Grid scores on development set:")
print()  
print grid.grid_scores_  
print("Best parameters set found on development set:")
print()
print(grid.best_params_)
print("Grid best score:")
print()
print (grid.best_score_)
# create a list of the mean scores only
grid_mean_scores = [result.mean_validation_score for result in grid.grid_scores_]
print grid_mean_scores

But its giving this error

raise ValueError("X should be a square kernel matrix") ValueError: X should be a square kernel matrix

Answer

ogrisel picture ogrisel · Mar 30, 2016

Remove 'precomputed' from your parameter space.

kernel='precomputed'can only be used when passing a (n_samples, n_samples) data matrix that represents pairwise similarities for the samples instead of the traditional (n_samples, n_features) rectangular data matrix.

See the documentation for more details on the meaning of the kernel parameter: