I applied an SVM (scikit-learn) to a dataset and wanted to find the values of C and gamma that give the best accuracy on the test set.
I first fixed C to some integer and then iterated over many values of gamma until I found the gamma that gave the best test set accuracy for that C. Then I fixed that gamma and iterated over values of C to find the C that gave the best accuracy, and so on ...
But tuning one parameter at a time like this is not guaranteed to find the combination of gamma and C that produces the best test set accuracy.
Can anyone help me find a way to get this combination (gamma, C) in scikit-learn?
You are looking for hyperparameter tuning. You pass a dictionary that maps each parameter name to a list of candidate values for your classifier; then, depending on the search method you choose (e.g. GridSearchCV, RandomizedSearchCV), the best combination found, as scored by cross-validation on the training data, is returned. You can read more about it here.
For example:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
# Create a dictionary of candidate parameter values
params_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100],
               'gamma': [0.0001, 0.001, 0.01, 0.1],
               'kernel': ['linear', 'rbf']}
# Create the GridSearchCV object
grid_clf = GridSearchCV(SVC(class_weight='balanced'), params_grid)
# Run the search: cross-validate every combination on the training data
grid_clf.fit(X_train, y_train)
# Print the best estimator with its parameters
print(grid_clf.best_estimator_)
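The snippet above assumes X_train and y_train already exist. Here is a minimal end-to-end sketch that can be run as is. The iris dataset, the 75/25 split, cv=5, and the exact grid values are assumptions chosen for illustration, not recommendations for your data. It also passes param_grid as a list of dictionaries so that gamma is only searched for the RBF kernel (the linear kernel ignores gamma), and it keeps the test set out of the search so the final score is an honest estimate.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Toy data standing in for your own dataset (assumption for illustration)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A list of grids: gamma is only varied for the RBF kernel,
# since the linear kernel does not use it
param_grid = [
    {"kernel": ["linear"], "C": [0.01, 0.1, 1, 10, 100]},
    {"kernel": ["rbf"], "C": [0.01, 0.1, 1, 10, 100],
     "gamma": [0.001, 0.01, 0.1, 1]},
]

# 5-fold cross-validation on the training data scores every combination
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_      # best (kernel, C, gamma) combination
cv_accuracy = search.best_score_       # mean cross-validated accuracy for it
# The test set is touched only once, after the search has finished
test_accuracy = search.score(X_test, y_test)

print(best_params)
print("CV accuracy:", cv_accuracy)
print("Test accuracy:", test_accuracy)
```

Because C and gamma are searched jointly, this avoids the problem of fixing one and tuning the other. Selecting the parameters by test set accuracy directly would make the reported accuracy optimistic, which is why the search uses cross-validation on the training split instead.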
You can read more about GridSearchCV here and RandomizedSearchCV here. A word of caution, though: SVMs are CPU-intensive, and a grid search fits one model per parameter combination per cross-validation fold, so be careful with the number of values you pass. The search can take a while depending on your data and the size of the grid.
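If the full grid is too expensive, RandomizedSearchCV lets you cap the cost by sampling a fixed number of combinations. A rough sketch, assuming an RBF kernel, log-uniform ranges that mirror the grid above, the iris dataset as a stand-in, and a budget of 20 samples (all of these are illustrative choices):

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Toy data standing in for your own training set (assumption for illustration)
X, y = load_iris(return_X_y=True)

# Sample C and gamma on a log scale instead of listing fixed values
param_distributions = {
    "C": loguniform(1e-3, 1e2),
    "gamma": loguniform(1e-4, 1e-1),
}

# n_iter fixes the number of sampled combinations, so the cost is
# n_iter * cv model fits regardless of how wide the ranges are
random_search = RandomizedSearchCV(
    SVC(kernel="rbf"),
    param_distributions,
    n_iter=20,
    cv=5,
    random_state=0,
)
random_search.fit(X, y)

sampled = random_search.cv_results_["params"]   # the 20 sampled combinations
best_random_params = random_search.best_params_
print(best_random_params)
```

Both GridSearchCV and RandomizedSearchCV also accept an n_jobs argument to spread the fits across CPU cores, which helps when the search is slow.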
This link also contains an example.