In simple words, what is the difference between cross-validation and grid search? How does grid search work? Should I do first a cross-validation and then a grid search?
Cross-validation is when you reserve part of your data to use in evaluating your model. There are different cross-validation methods. The simplest conceptually is to just take 70% (just making up a number here, it doesn't have to be 70%) of your data and use that for training, and then use the remaining 30% of the data to evaluate the model's performance. The reason you need different data for training and evaluating the model is to protect against overfitting. There are other (slightly more involved) cross-validation techniques, of course, like k-fold cross-validation, which often used in practice.
Grid search is a method to perform hyper-parameter optimisation, that is, it is a method to find the best combination of hyper-parameters (an example of an hyper-parameter is the learning rate of the optimiser), for a given model (e.g. a CNN) and test dataset. In this scenario, you have several models, each with a different combination of hyper-parameters. Each of these combinations of parameters, which correspond to a single model, can be said to lie on a point of a "grid". The goal is then to train each of these models and evaluate them e.g. using cross-validation. You then select the one that performed best.
To give a concrete example, if you're using a support vector machine, you could use different values for gamma
and C
. So, for example, you could have a grid with the following values for (gamma, C)
: (1, 1), (0.1, 1), (1, 10), (0.1, 10)
. It's a grid because it's like a product of [1, 0.1]
for gamma
and [1, 10]
for C
. Grid-search would basically train a SVM for each of these four pair of (gamma, C)
values, then evaluate it using cross-validation, and select the one that did best.