Skorch GridSearchCV: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan

Bruno Morabito · May 29, 2020 · Viewed 13.4k times

I have feature data x of shape (n_samples, time_steps, n_features) and labels y of shape (n_samples, 1, n_labels).

From this I create training, development, and test PyTorch datasets.

I want to use GridSearchCV to run a grid search over the hyperparameters. This is what I wrote:

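import numpy as np
import torch
from sklearn.metrics import mean_squared_error  # assumed origin of mean_squared_error
from sklearn.model_selection import GridSearchCV
from skorch import NeuralNetRegressor
from skorch.helper import SliceDataset, predefined_split

'Define the network'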
sampling_interval = 0.1
net = ConvNet(time_window, ny)
net.float()

'Split test training set'
# Train/test split. In this case we take some experiments as test and some as training.
train_set_split = 0.9
dev_set_split = 0.05
test_set_split = 0.05

# Creating data indices for training and validation splits:
dataset_size = x.shape[0]
indices = list(range(dataset_size))
np.random.shuffle(indices)
split1 = int(np.floor(train_set_split * dataset_size))
split2 = int(np.floor(dev_set_split * dataset_size))
split3 = int(np.floor(test_set_split * dataset_size))

train_indices, dev_indices, test_indices = indices[:split1], indices[split1:split1 + split2], indices[split1 + split2:]

'Create a dataset'
train_dataset = MyDataset(x[train_indices, :, :], y[train_indices], net.device, net.dtype)
dev_dataset = MyDataset(x[dev_indices, :, :], y[dev_indices], net.device, net.dtype)
test_dataset = MyDataset(x[test_indices, :, :], y[test_indices], net.device, net.dtype)


'Define the optimizer'
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)

'Define the loss function'
loss_func = torch.nn.MSELoss()

net_regr = NeuralNetRegressor(
    module=ConvNet,
    module__ny=ny,
    module__time_window=time_window,
    max_epochs=100,
    lr=0.1,
    train_split=predefined_split(dev_dataset),
    criterion=mean_squared_error,
    batch_size=batch_size,
)


params = {
    'lr': [0.01, 0.05, 0.1],
    'max_epochs': [100, 200, 300],
}

X_sl = SliceDataset(train_dataset, idx=0)  # idx=0 is the default
y_sl = SliceDataset(train_dataset, idx=1)

gs = GridSearchCV(net_regr, params, refit=False, verbose=4)

gs.fit(X_sl, y_sl)
print(gs.best_score_, gs.best_params_)

But I get this error:

 FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details: 
ValueError: The target data shouldn't be 1-dimensional but instead have 2 dimensions, with the second dimension having the same size as the number of regression targets (usually 1). Please reshape your target data to be 2-dimensional (e.g. y = y.reshape(-1, 1).

But the shape of the target is two-dimensional:

>>> y_sl[0].shape

torch.Size([1, 4])

where 4 is the number of targets (n_labels).

So I don't understand where this error comes from.
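
Note that y_sl[0].shape is the shape of a single sample, not of the whole target. The following is a minimal NumPy sketch, assuming the labels are stacked as (n_samples, 1, n_labels) as described at the top of the question; the array and its sizes are made up for illustration. It shows that a 2-D sample implies a 3-D target, and how the singleton axis could be dropped in the way the error message suggests. Whether this alone resolves the GridSearchCV failure is not verified here.

```python
import numpy as np

# Hypothetical stand-in for the real labels: 10 samples, 4 targets each.
n_samples, n_labels = 10, 4
y = np.zeros((n_samples, 1, n_labels), dtype=np.float32)

# A single sample is 2-D, but the stacked target has three dimensions.
sample_shape = y[0].shape
full_shape = y.shape

# Drop the singleton middle axis so the target becomes (n_samples, n_labels).
y_2d = y.reshape(n_samples, n_labels)

print(sample_shape, full_shape, y_2d.shape)
```

Checking y.shape (or the shape of whatever object is actually passed as y to fit) rather than the shape of one element is the quicker way to see what dimensionality the estimator receives.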

Answer