Data Science Asked on April 16, 2021
I am trying to fine-tune some code from a Kaggle kernel. The model uses pretrained VGG16 weights (via 'imagenet') for transfer learning. However, I notice there is no freezing of layers, as is recommended in a Keras blog post. One approach would be to freeze all of the VGG16 layers and leave only the last few layers trainable when compiling, for example:
for layer in model.layers[:-5]:
    layer.trainable = False
Supposedly, this keeps the ImageNet weights in the frozen layers and trains only the last 5 layers. What are the consequences of not freezing the VGG16 layers?
from keras.models import Sequential, Model, load_model
from keras import applications
from keras import optimizers
from keras.layers import Dropout, Flatten, Dense

img_rows, img_cols, img_channel = 224, 224, 3

# VGG16 convolutional base with ImageNet weights, without the classifier head
base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(img_rows, img_cols, img_channel))

# New classifier head for binary classification
add_model = Sequential()
add_model.add(Flatten(input_shape=base_model.output_shape[1:]))
add_model.add(Dense(256, activation='relu'))
add_model.add(Dense(1, activation='sigmoid'))

# Full model: VGG16 base followed by the new head
model = Model(inputs=base_model.input, outputs=add_model(base_model.output))
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
model.summary()
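For reference, here is a rough sketch of how the freezing from the first snippet slots into this code. Note that the Sequential head is attached as a single entry in model.layers, so model.layers[:-5] should leave roughly the last VGG16 convolutional block plus the new head trainable; also, Keras only picks up changes to the trainable flags when the model is (re)compiled.

# Freeze everything except (roughly) the last VGG16 block and the new head.
for layer in model.layers[:-5]:
    layer.trainable = False

# Keras reads the trainable flags at compile time, so compile (again) afterwards.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
model.summary()  # the 'Non-trainable params' count should now be large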
I think that the main consequences are a much longer training time (the whole VGG16 backbone is updated, not just the new head) and the risk of overwriting the pretrained ImageNet features that made transfer learning attractive in the first place.
To wrap up, I think the main point is to check whether your images are comparable to the ones in ImageNet. If they are, I would not unfreeze many layers; a rough sketch of both options follows. Otherwise, unfreeze everything, but be prepared for a long training time.
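As a sketch of that advice, assuming the base_model and model objects from the question (the images_like_imagenet flag is just a hypothetical placeholder for your own judgement):

# Hypothetical flag: set it according to how similar your images are to ImageNet.
images_like_imagenet = True

if images_like_imagenet:
    # Similar to ImageNet: keep the VGG16 base frozen and train only the new head.
    for layer in base_model.layers:
        layer.trainable = False
else:
    # Very different from ImageNet: unfreeze everything, but expect a much longer
    # training time, since every VGG16 weight is now being updated.
    for layer in base_model.layers:
        layer.trainable = True

# Recompile so the new trainable flags are taken into account.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])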
Correct answer by David Masip on April 16, 2021
The reason freezing can save computation time is that the pretrained network can already extract generic features from your data; it does not have to learn them from scratch.
A neural network works by abstracting and transforming information in stages. In the initial layers, the extracted features are fairly generic and independent of the particular task; it is the later layers that are tuned to the specific task. So by freezing the initial stages you get a network that can already extract meaningful general features, and you unfreeze only the last few stages (or just the new, untrained layers), which are then tuned to your particular task.
Also, I would not recommend unfreezing all layers if you have any new/untrained layers in your model. These untrained layers will have large gradients in the first few epochs, and your model will train as if it had been initialized with random (rather than pretrained) weights. A common remedy is to first train only the new head with the base frozen, and only then unfreeze a few of the last pretrained layers; see the sketch below.
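A sketch of that two-step schedule, using the base_model and model objects from the question (x_train, y_train, x_val, y_val and the epoch counts are placeholders):

# Phase 1: train only the new, randomly initialized head, so its large initial
# gradients cannot disturb the pretrained VGG16 weights.
for layer in base_model.layers:
    layer.trainable = False
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_val, y_val))

# Phase 2: unfreeze the last convolutional block and fine-tune with a lower
# learning rate, so the pretrained features are only nudged rather than erased.
for layer in base_model.layers[-4:]:
    layer.trainable = True
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-5, momentum=0.9),
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_val, y_val))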
Answered by vivek on April 16, 2021
The result of not freezing the pretrained layers is that the information they contain will be destroyed during future training rounds.
See the Transfer learning and fine-tuning guide from TensorFlow.
Answered by Yacine Rouizi on April 16, 2021