What is the definition of a non-trainable parameter?

Question 1

What is the definition of a non-trainable parameter?

tensorflow deep-learning keras theano caffe

TheWho · Nov 15, 2017 · Viewed 18.2k times · Source

Answer

Answer

In keras, non-trainable parameters (as shown in model.summary()) means the number of weights that are not updated during training with backpropagation.

There are mainly two types of non-trainable weights:

The ones that you have chosen to keep constant when training. This means that keras won't update these weights during training at all.
The ones that work like statistics in BatchNormalization layers. They're updated with mean and variance, but they're not "trained with backpropagation".

Weights are the values inside the network that perform the operations and can be adjusted to result in what we want. The backpropagation algorithm changes the weights towards a lower error at the end.

By default, all weights in a keras model are trainable.

When you create layers, internally it creates its own weights and they're trainable. (The backpropagation algorithm will update these weights)

When you make them untrainable, the algorithm will not update these weights anymore. This is useful, for instance, when you want a convolutional layer with a specific filter, like a Sobel filter, for instance. You don't want the training to change this operation, so these weights/filters should be kept constant.

There is a lot of other reasons why you might want to make weights untrainable.

Changing parameters:

For deciding whether weights are trainable or not, you take layers from the model and set trainable:

model.get_layer(layerName).trainable = False #or True

This must be done before compilation.

Question 2

What is the definition of non-trainable parameter in a model?

For example, while you are building your own model, its value is 0 as a default, but when you want to use an inception model, it is becoming something else rather than 0. What would be the reason behind it?

What is the definition of a non-trainable parameter?

Answer

Related questions