How to fine-tune ResNet50 in Keras?

J.K. picture J.K. · May 9, 2017 · Viewed 12.1k times · Source

Im trying to finetune the existing models in Keras to classify my own dataset. Till now I have tried the following code (taken from Keras docs: https://keras.io/applications/) in which Inception V3 is fine-tuned on a new set of classes.

from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)

# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.

# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
   print(i, layer.name)

# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 172 layers and unfreeze the rest:
for layer in model.layers[:172]:
   layer.trainable = False
for layer in model.layers[172:]:
   layer.trainable = True

# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')

# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers
model.fit_generator(...)

Can anyone plz guide me what changes should I do in the above code so as to fine-tune ResNet50 model present in Keras.

Thanks in advance.

Answer

Phil picture Phil · May 15, 2017

It is difficult to make out a specific question, have you tried anything more than just copying the code without any changes?

That said, there is an abundance of problems in the code: It is a simple copy/paste from keras.io, not functional as it is, and needs some adaption before working at all (regardless of using ResNet50 or InceptionV3):

1): You need to define the input_shape when loading InceptionV3, specifically replace base_model = InceptionV3(weights='imagenet', include_top=False) with base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(299,299,3))

2): Further, you need to adapt the number of the classes in the last added layer, e.g. if you have only 2 classes to: predictions = Dense(2, activation='softmax')(x)

3): Change the loss-function when compiling your model from categorical_crossentropy to sparse_categorical_crossentropy

4): Most importantly, you need to define the fit_generator before calling model.fit_generator() and add steps_per_epoch. If you have your training images in ./data/train with every category in a different subfolder, this can be done e.g. like this:

from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
     "./data/train",
    target_size=(299, 299),
    batch_size=50,
    class_mode='binary')
model.fit_generator(train_generator, steps_per_epoch=100)

This of course only does basic training, you will for example need to define save calls to hold on to the trained weights. Only if you get the code working for InceptionV3 with the changes above I suggest to proceed to work on implementing this for ResNet50: As a start you can replace InceptionV3() with ResNet50() (of course only after from keras.applications.resnet50 import ResNet50), and change the input_shape to (224,224,3) and target_size to (224,244).

The above mentioned code-changes should work on Python 3.5.3 / Keras 2.0 / Tensorflow backend.