I'm trying to find the best Neural Network model for classifying breast cancer samples on the well-known Wisconsin Breast Cancer dataset (569 samples, 30 features plus a binary target). I'm using sklearn 0.18.1. I'm not applying any normalization so far; I'll add it once I've solved this question.
# some init code omitted
from sklearn.model_selection import train_test_split, KFold, GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y)

# Define the NN params for the GridSearchCV
tuned_params = [{'solver': ['sgd'], 'learning_rate': ['constant'], 'learning_rate_init': [0.001, 0.01, 0.05, 0.1]},
                {'learning_rate_init': [0.001, 0.01, 0.05, 0.1]}]

# CV method and model
cv_method = KFold(n_splits=4, shuffle=True)
model = MLPClassifier()

# Apply the grid search
grid = GridSearchCV(estimator=model, param_grid=tuned_params, cv=cv_method, scoring='accuracy')
grid.fit(X_train, y_train)
y_pred = grid.predict(X_test)
And if I run:
print(grid.best_score_)
print(accuracy_score(y_test, y_pred))
The results are 0.746478873239 and 0.902097902098 respectively.
According to the doc, "best_score_ : float, Score of best_estimator on the left out data". I take this to mean it is the best accuracy among those obtained by running the 8 different configurations specified in tuned_params, each evaluated the number of times specified by KFold on the left-out folds. Am I right?
One more question: is there a method to find the optimal size of the test set for train_test_split, which defaults to 0.25?
Thanks a lot
The grid.best_score_ is the score averaged over all CV folds for the single best combination of the parameters you specified in tuned_params.
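A minimal sketch of this, using a small grid on the same dataset (the two-value grid, random seeds, and max_iter are illustrative assumptions, not from the question): best_score_ is the mean of the per-fold test scores at best_index_.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, KFold, GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

grid = GridSearchCV(MLPClassifier(max_iter=200),
                    {'learning_rate_init': [0.001, 0.01]},  # small illustrative grid
                    cv=KFold(n_splits=4, shuffle=True, random_state=0),
                    scoring='accuracy')
grid.fit(X_train, y_train)

# Collect the 4 per-fold test scores of the best candidate from cv_results_
res = grid.cv_results_
fold_scores = [res['split%d_test_score' % i][grid.best_index_] for i in range(4)]

# best_score_ is their mean, i.e. the mean cross-validated accuracy of the winner
print(np.isclose(np.mean(fold_scores), grid.best_score_))
```

So best_score_ is a cross-validation estimate computed on X_train only; the accuracy on X_test is an independent measurement, which is why the two numbers in the question differ.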
To access other relevant details about the grid searching process, you can look at the grid.cv_results_ attribute.
From the documentation of GridSearchCV:
cv_results_ : dict of numpy (masked) ndarrays
A dict with keys as column headers and values as columns, that can be imported into a pandas DataFrame
It contains keys like 'split0_test_score', 'split1_test_score', 'mean_test_score', 'std_test_score', 'rank_test_score', 'split0_train_score', 'split1_train_score', 'mean_train_score', etc., which give additional information about the whole execution.
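As the documentation suggests, the dict imports cleanly into a pandas DataFrame, which makes it easy to compare all candidates side by side. A minimal sketch (the small grid and seeds are illustrative assumptions):

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
grid = GridSearchCV(MLPClassifier(max_iter=200),
                    {'learning_rate_init': [0.001, 0.01]},  # small illustrative grid
                    cv=KFold(n_splits=4, shuffle=True, random_state=0),
                    scoring='accuracy')
grid.fit(X, y)

# One row per parameter combination; rank 1 is the best candidate
df = pd.DataFrame(grid.cv_results_)
print(df[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
      .sort_values('rank_test_score'))
```

Looking at std_test_score alongside mean_test_score is useful here: it shows how stable each candidate was across the 4 folds, not just its average.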