In TensorFlow 2.0 with eager execution, how do I compute the gradients of a network output w.r.t. a specific layer?

I have a network made with InceptionNet, and for an input sample bx, I want to compute the gradients of the model output w.r.t. the hidden layer. I have the following code:

bx = tf.reshape(x_batch[0, :, :, :], (1, 299, 299, 3))

with tf.GradientTape() as gtape:
    preds = model(bx)
    print(preds.shape, end='  ')

    class_idx = np.argmax(preds[0])
    print(class_idx, end='   ')

    class_output = model.output[:, class_idx]
    print(class_output, end='   ')

    last_conv_layer = model.get_layer('inception_v3').get_layer('mixed10')

grads = gtape.gradient(class_output, last_conv_layer.output)#[0]

However, this gives None. I tried as well, but it still gives None.

Before trying GradientTape, I tried tf.keras.backend.gradients, but that raised the following error:

RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.

My model is as follows:


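Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
inception_v3 (Model)         (None, 1000)              23851784  
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 2002      
=================================================================
Total params: 23,853,786
Trainable params: 23,819,354
Non-trainable params: 34,432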

Any solution is appreciated. It doesn't have to be GradientTape, if there is any other way to compute these gradients.


Fantasty · Jun 12, 2019

I had the same problem as you. I'm not sure if this is the cleanest way to solve the problem, but here's my solution.

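I think the problem is that you need to pass the actual return value of last_conv_layer.call(...) as an argument to gtape.watch(). The symbolic last_conv_layer.output is not the tensor that was computed during the forward pass, so the tape has nothing recorded for it. Since all layers are called sequentially within the scope of the model(bx) call, you have to inject some code into this inner scope. I did this using the following decorator: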

def watch_layer(layer, tape):
    """
    Make an intermediate hidden `layer` watchable by the `tape`.
    After calling this function, you can obtain the gradient with
    respect to the output of the `layer` by calling:

        grads = tape.gradient(..., layer.result)
    """
    def decorator(func):
        def wrapper(*args, **kwargs):
            # Store the result of `layer.call` internally.
            layer.result = func(*args, **kwargs)
            # From this point onwards, watch this tensor.
            tape.watch(layer.result)
            # Return the result to continue with the forward pass.
            return layer.result
        return wrapper
    # Replace the layer's `call` with the wrapped version.
    layer.call = decorator(layer.call)
    return layer
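
To see why the tensor must be the one produced inside the tape's scope, here is a minimal sketch on a toy two-layer network. The names `dense1`, `dense2`, `stale`, and `hidden` are made up for illustration and are not part of the original model; the sketch assumes TensorFlow 2.x in eager mode.

```python
import tensorflow as tf

# Toy stand-ins for a hidden layer and an output layer (hypothetical names).
dense1 = tf.keras.layers.Dense(3)
dense2 = tf.keras.layers.Dense(1)
x = tf.constant([[1.0, 2.0]])

# Computed OUTSIDE the tape: same values, but a different tensor object
# that the tape never saw.
stale = dense1(x)

with tf.GradientTape(persistent=True) as tape:
    # Computed INSIDE the tape: this is the tensor the forward pass uses.
    hidden = dense1(x)
    tape.watch(hidden)
    out = dense2(hidden)

grad_stale = tape.gradient(out, stale)   # not connected to `out`
grad_live = tape.gradient(out, hidden)   # connected to `out`

print(grad_stale)        # None
print(grad_live.shape)   # (1, 3)
```

The gradient with respect to `stale` is None because `out` was never computed from that tensor, which is the same situation as asking for the gradient with respect to `last_conv_layer.output`.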

In your example, I believe the following should then work for you:

bx = tf.reshape(x_batch[0, :, :, :], (1, 299, 299, 3))
last_conv_layer = model.get_layer('inception_v3').get_layer('mixed10')
with tf.GradientTape() as gtape:
    # Make the `last_conv_layer` watchable
    watch_layer(last_conv_layer, gtape)  
    preds = model(bx)
    class_idx = int(np.argmax(preds[0]))
    # Use the tensor computed inside the tape, not the symbolic `model.output`.
    class_output = preds[:, class_idx]
# Get the gradient w.r.t. the output of `last_conv_layer`
grads = gtape.gradient(class_output, last_conv_layer.result)  
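
Since the question allows approaches other than patching a layer, one possible alternative is to build a sub-model that returns the hidden activation alongside the base model's output, then apply the remaining layers by hand. The sketch below is not the method above, and it uses a tiny hypothetical model (`base`, `last_conv`, `head`, 8x8 inputs) in place of InceptionV3 and `mixed10`, so treat it as an outline under those assumptions rather than a drop-in solution.

```python
import tensorflow as tf

# Hypothetical stand-in for the nested InceptionV3 base model.
inputs = tf.keras.Input(shape=(8, 8, 3))
h = tf.keras.layers.Conv2D(4, 3, padding="same", activation="relu",
                           name="last_conv")(inputs)
h = tf.keras.layers.GlobalAveragePooling2D()(h)
base_outputs = tf.keras.layers.Dense(5, name="base_out")(h)
base = tf.keras.Model(inputs, base_outputs, name="base")

# Same overall structure as in the question: a nested model plus a Dense head.
model = tf.keras.Sequential(
    [base, tf.keras.layers.Dense(2, activation="softmax", name="head")]
)

inner = model.get_layer("base")
conv_layer = inner.get_layer("last_conv")

# Sub-model exposing both the hidden activation and the base output.
feature_model = tf.keras.Model(inner.input, [conv_layer.output, inner.output])
head_layers = model.layers[1:]  # everything after the nested base model

bx = tf.random.uniform((1, 8, 8, 3))
with tf.GradientTape() as tape:
    conv_out, preds = feature_model(bx)
    # Apply the remaining layers of the outer model manually.
    for layer in head_layers:
        preds = layer(preds)
    class_idx = int(tf.argmax(preds[0]))
    class_output = preds[:, class_idx]

# Gradient of the chosen class score w.r.t. the hidden activation.
grads = tape.gradient(class_output, conv_out)
print(grads.shape)  # (1, 8, 8, 4)
```

With the real model, the same idea would use `model.get_layer('inception_v3')` as `inner` and `'mixed10'` as the hidden layer; whether that fits depends on how the outer model was assembled.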