Similar to "Custom cross validation split sklearn", I want to define my own splits for GridSearchCV, which means customizing the built-in cross-validation iterator.
I want to pass my own set of train/test indices for cross-validation to GridSearchCV instead of letting the iterator determine them for me. I went through the available CV iterators on the sklearn documentation page but couldn't find one that does this.
For example, I want to implement something like the following. The data has 9 samples. For 2-fold CV, I create my own sets of training and testing indices:
>>> train_indices = [[1,3,5,7,9],[2,4,6,8]]
>>> test_indices = [[2,4,6,8],[1,3,5,7,9]]
>>> # in each list, the first inner list is the 1st fold and the second is the 2nd fold
>>> custom_cv = sklearn.cross_validation.customcv(train_indices,test_indices)
>>> clf = GridSearchCV(X,y,params,cv=custom_cv)
What can be used to work like customcv?
Actually, cross-validation iterators are just that: iterators. Each iteration yields a (train, test) tuple of index arrays. This should then work for you:
custom_cv = zip(train_indices, test_indices)
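As a fuller illustration, here is a minimal sketch of passing hand-built folds to GridSearchCV. It assumes a recent scikit-learn, where GridSearchCV lives in sklearn.model_selection; the toy data, the SVC estimator, and the parameter grid are made up for the example. The indices are 0-based so that they are valid row positions, and the zip is wrapped in list() so the folds can be iterated more than once.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy data (illustrative only): 10 samples, 2 features.
rng = np.random.RandomState(0)
X = rng.rand(10, 2)
# Labels chosen so that every training fold below contains both classes.
y = np.array([0, 0, 1, 1, 0, 0, 1, 1, 0, 0])

# Hand-picked folds, 0-based so they index rows of X directly.
train_indices = [[1, 3, 5, 7, 9], [0, 2, 4, 6, 8]]
test_indices = [[0, 2, 4, 6, 8], [1, 3, 5, 7, 9]]

# A list of (train, test) tuples; unlike a bare zip object it is reusable.
custom_cv = list(zip(train_indices, test_indices))

param_grid = {"C": [0.1, 1.0]}  # hypothetical grid for the sketch
search = GridSearchCV(SVC(kernel="linear"), param_grid, cv=custom_cv)
search.fit(X, y)

print(search.n_splits_)    # number of folds actually used
print(search.best_params_)
```

Note that the estimator and the parameter grid go to the GridSearchCV constructor, while X and y are passed to fit().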
Also, for the specific case you mention, you can do:
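import numpy as np
# Note: sklearn.cross_validation was removed in scikit-learn 0.20;
# the newer equivalent is sklearn.model_selection.LeaveOneGroupOut.
from sklearn.cross_validation import LeaveOneLabelOut
# Alternate labels 0, 1, 0, 1, ... so each label defines one test fold.
labels = np.arange(0, 10) % 2
cv = LeaveOneLabelOut(labels)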
Observe that list(cv) yields:
[(array([1, 3, 5, 7, 9]), array([0, 2, 4, 6, 8])),
(array([0, 2, 4, 6, 8]), array([1, 3, 5, 7, 9]))]
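On newer scikit-learn versions, where the sklearn.cross_validation module no longer exists, roughly the same folds can be produced with LeaveOneGroupOut from sklearn.model_selection. This is a sketch under that assumption; the placeholder X is only there because split() needs to know the number of samples.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Same alternating grouping as above: 0, 1, 0, 1, ...
groups = np.arange(0, 10) % 2

# Placeholder features; only the number of rows matters for splitting.
X = np.zeros((10, 1))

# In the newer API the groups are passed to split(), not to the constructor.
splits = list(LeaveOneGroupOut().split(X, groups=groups))

for train, test in splits:
    print(train, test)
```

The resulting list of (train, test) tuples can be passed as cv=splits, or the splitter itself can be passed as cv with groups=groups given to fit().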