The concept of straight through estimator (STE)

Amir · Jul 13, 2016

I have seen the straight through estimator (STE) in many neural-network papers, e.g. this and this, but I cannot understand the concept. Could anyone explain STE or refer me to a simple resource?

Answer

Chinni · Apr 18, 2018

A straight through estimator is a way of estimating gradients for a threshold operation in a neural network. The threshold could be as simple as the following function,

[Image: definition of the threshold function]

As we can see, the derivative of this threshold function is 0 almost everywhere, so during back-propagation the network will not learn anything: it receives zero gradients and the weights are never updated.
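To make this concrete, suppose the threshold is a unit step. This particular form is only an assumption for illustration (the pictured function may differ), and the symbols x, y and the loss L are introduced here, not taken from the answer:

```latex
\[
f(x) =
\begin{cases}
1 & \text{if } x > 0 \\
0 & \text{if } x \le 0
\end{cases}
\qquad\Longrightarrow\qquad
f'(x) = 0 \quad \text{for all } x \neq 0 .
\]
\[
y = f(x)
\quad\Longrightarrow\quad
\frac{\partial L}{\partial x}
  = \frac{\partial L}{\partial y}\, f'(x)
  = 0 .
\]
```

Whatever the loss is, the chain rule multiplies its gradient by zero at the threshold, so no parameter that feeds into x receives a learning signal.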

The concept of a straight through estimator is that, during back-propagation, you pass the gradient arriving at the output of the threshold function unchanged to its input, disregarding the derivative of the threshold function itself. In other words, the backward pass treats the threshold as if it were the identity function. This has been shown to perform well in the results (Figure 2) in this paper you have referenced.
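The idea can be sketched in a few lines of plain Python. This is a minimal toy example, not code from the referenced papers: the unit-step threshold, the single weight, the squared-error loss, and every function name below are assumptions chosen for illustration.

```python
def threshold(x):
    """Hard threshold (assumed form): 1.0 if x > 0, else 0.0."""
    return 1.0 if x > 0 else 0.0


def backward_true(upstream_grad):
    """Exact backward pass: the threshold's derivative is 0 (for x != 0)."""
    return 0.0 * upstream_grad


def backward_ste(upstream_grad):
    """Straight through estimator: pass the gradient through unchanged."""
    return upstream_grad


def train(backward, steps=50, lr=0.1):
    """Fit one weight w so that threshold(w * x) matches the target."""
    w = -0.5                # hypothetical starting weight (wrong side of 0)
    x, target = 1.0, 1.0    # a single hypothetical training example
    for _ in range(steps):
        pre = w * x                        # forward: pre-activation
        y = threshold(pre)                 # forward: hard threshold
        grad_y = 2.0 * (y - target)        # d/dy of the loss (y - target)^2
        grad_pre = backward(grad_y)        # backward through the threshold
        grad_w = grad_pre * x              # backward through pre = w * x
        w -= lr * grad_w                   # gradient descent step
    return w


w_true = train(backward_true)  # gradient is always zero: w never moves
w_ste = train(backward_ste)    # gradient passes through: w crosses zero

print("exact gradient:", w_true, "-> output", threshold(w_true))
print("STE gradient:  ", w_ste, "-> output", threshold(w_ste))
```

With the exact derivative the weight stays at its initial value, while with the straight through estimator it moves until the thresholded output matches the target, after which the loss gradient itself becomes zero and the updates stop. Automatic-differentiation frameworks usually express the same trick by defining a custom backward pass for the threshold operation.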