Why do we have to normalize the input for an artificial neural network?

karla · Jan 12, 2011 · Viewed 111.5k times

This is a fundamental question regarding the theory of neural networks:

Why do we have to normalize the input for a neural network?

I understand that sometimes, for example when the input values are non-numerical, a certain transformation must be performed. But when we already have numerical input, why must the numbers lie in a certain interval?

What will happen if the data is not normalized?

Answer

finnw · Jan 12, 2011

It's explained well in the comp.ai.neural-nets FAQ:

If the input variables are combined linearly, as in an MLP [multilayer perceptron], then it is rarely strictly necessary to standardize the inputs, at least in theory. The reason is that any rescaling of an input vector can be effectively undone by changing the corresponding weights and biases, leaving you with the exact same outputs as you had before. However, there are a variety of practical reasons why standardizing the inputs can make training faster and reduce the chances of getting stuck in local optima. Also, weight decay and Bayesian estimation can be done more conveniently with standardized inputs.
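For concreteness, here is a minimal NumPy sketch of the standardization the FAQ is talking about (the feature values are invented for illustration). It rescales each input feature to zero mean and unit standard deviation, then verifies the FAQ's theoretical point: for a single linear unit, the rescaling can be undone exactly by adjusting the weights and bias, so the outputs are unchanged.

import numpy as np

# Hypothetical raw feature matrix: rows are samples, columns are features
# on very different scales (e.g. age in years vs. income in dollars).
X = np.array([[25.0,  40_000.0],
              [32.0,  95_000.0],
              [47.0,  61_000.0],
              [51.0, 120_000.0]])

# Z-score standardization: shift each feature to mean 0, scale to std 1.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_std = (X - mean) / std

print(X_std.mean(axis=0))  # ~[0, 0]
print(X_std.std(axis=0))   # ~[1, 1]

# The FAQ's "rescaling can be undone" claim, for one linear unit
# y = w @ x + b: with standardized inputs, choosing w' = w * std and
# b' = b + w @ mean reproduces the original outputs exactly.
w = np.array([0.3, -0.002])  # arbitrary example weights
b = 1.5
w_prime = w * std
b_prime = b + w @ mean
assert np.allclose(X @ w + b, X_std @ w_prime + b_prime)

Note that in practice the mean and standard deviation should be computed on the training set only and then reused to transform validation and test data, so that no information leaks from held-out data into training.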