ReLU derivative in backpropagation

Gergely Papp · Feb 4, 2017 · Viewed 25.4k times

I am about to implement backpropagation on a neural network that uses ReLU. In a previous project of mine, I did it on a network that used the sigmoid activation function, but now I'm a little bit confused, since ReLU doesn't have a derivative.

Here's an image showing how weight5 contributes to the total error. In this example, ∂out/∂net = a*(1 - a) if I use the sigmoid function.

What should I write instead of "a*(1 - a)" to make the backpropagation work?

Answer

malioboro · Feb 5, 2017

"since ReLU doesn't have a derivative."

No, ReLU does have a derivative. I assume you are using the ReLU function f(x) = max(0, x). This means that if x <= 0 then f(x) = 0, otherwise f(x) = x. In the first case, when x < 0, the derivative of f(x) with respect to x is f'(x) = 0. In the second case, it is clear that f'(x) = 1. (At exactly x = 0 the derivative is not defined, but in practice implementations simply pick 0 or 1 there.)
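
So in the backprop step, instead of a*(1 - a) you use this 0-or-1 slope. Here is a minimal sketch in Python (assuming `net` is the unit's input before the activation, and assuming NumPy; names like `relu_derivative` are just illustrative, not from your code):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x)
    return np.maximum(0.0, x)

def relu_derivative(x):
    # f'(x) = 0 for x < 0 and f'(x) = 1 for x > 0
    # (at x = 0 the derivative is undefined; 0 is chosen here by convention)
    return (x > 0).astype(float)

# In backpropagation, replace the sigmoid term a * (1 - a)
# with relu_derivative(net), where net is the unit's input.
net = np.array([-1.2, 0.0, 0.7, 2.5])   # hypothetical net inputs
print(relu(net))             # [0.  0.  0.7 2.5]
print(relu_derivative(net))  # [0. 0. 1. 1.]
```

In other words, the gradient simply passes through unchanged for units that are active (net > 0) and is zeroed out for units that are inactive.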