I have a fairly simple ANN built with TensorFlow, using AdamOptimizer for a regression problem, and I am now at the point of tuning all the hyperparameters.
So far, I have identified many different hyperparameters that I need to tune:
I have two questions:
1) Do you see any other hyperparameters I might have forgotten?
2) For now, my tuning is quite "manual", and I am not sure I am doing it properly. Is there a particular order in which to tune the parameters? E.g. learning rate first, then batch size, then... I am not sure that all these parameters are independent; in fact, I am quite sure that some of them are not. Which ones are clearly independent, and which ones are clearly not? Should the dependent ones be tuned together? Is there any paper or article that discusses tuning all the parameters in a particular order?
EDIT: Here are the graphs I got for different initial learning rates, batch sizes, and regularization parameters. The purple curve looks completely strange to me... Its cost decreases much more slowly than the others, yet it gets stuck at a lower accuracy. Is it possible that the model is stuck in a local minimum?
For the learning rate, I used the decay schedule LR(t) = LR_I / sqrt(epoch), where LR_I is the initial learning rate.
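That schedule can be sketched in a few lines of Python; this is just an illustration of the formula above, assuming epochs are counted from 1 (so the first epoch uses the full initial rate and we avoid division by zero):

```python
import math

def decayed_lr(initial_lr: float, epoch: int) -> float:
    """Learning rate for a given epoch under LR(t) = LR_I / sqrt(epoch).

    `epoch` is assumed to be 1-indexed.
    """
    return initial_lr / math.sqrt(epoch)

# Example: with an initial rate of 0.01, epochs 1..4 give
# 0.01, ~0.00707, ~0.00577, 0.005.
rates = [decayed_lr(0.01, e) for e in range(1, 5)]
```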
Thanks for your help! Paul
My general order is:
Dependencies:
I'd assume that the optimal values of
strongly depend on each other. I am not an expert in that field, though.
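One common way to handle such dependencies is to sample the interacting hyperparameters jointly (random search) rather than tuning them one at a time. A minimal sketch, where `train_and_evaluate` is a placeholder for your own training loop and the parameter names and ranges are only assumptions:

```python
import random

def random_search(train_and_evaluate, n_trials=20, seed=0):
    """Sample hyperparameters jointly and keep the best configuration.

    `train_and_evaluate` is assumed to take a config dict and return a
    validation loss (lower is better).
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = {
            # Sample the learning rate and regularization strength
            # log-uniformly, the batch size from a fixed set.
            "learning_rate": 10 ** rng.uniform(-5, -2),
            "batch_size": rng.choice([32, 64, 128, 256]),
            "l2_reg": 10 ** rng.uniform(-6, -2),
        }
        loss = train_and_evaluate(config)
        if best is None or loss < best[0]:
            best = (loss, config)
    return best  # (best_loss, best_config)
```

Because each trial draws all the values together, combinations that only work well jointly (e.g. a large batch size with a larger learning rate) can still be found, which a one-parameter-at-a-time sweep may miss.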
As for your hyperparameters: