Data Science Asked on May 2, 2021
In transfer learning, often only the last layer of the network is retrained using gradient descent.
However, the last layer of a typical neural network performs only a linear transformation, so why do we use gradient descent rather than linear (or logistic) regression to fine-tune the last layer?
The common approach to fine-tuning an existing pre-trained neural network is the following (a code sketch follows the list):

1. Take the layers of a previously trained model (e.g., a network pre-trained on a large dataset such as ImageNet).
2. Freeze those layers so their weights are not updated during training.
3. Add one or more new, trainable layers on top of the frozen layers.
4. Train the new layers on the target dataset.
5. Optionally, unfreeze some or all of the base model and continue training the whole network with a very low learning rate.
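To make the workflow concrete, here is a minimal sketch in Keras. The choice of ResNet50 as the backbone, the 224×224 input size, and the 10-class head are assumptions for illustration only.

```python
import tensorflow as tf
from tensorflow import keras

# Steps 1-2: load a pre-trained backbone without its classifier and freeze it.
base_model = keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(224, 224, 3))
base_model.trainable = False

# Step 3: add a new trainable classification layer on top of the frozen base.
inputs = keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)  # run frozen layers in inference mode
outputs = keras.layers.Dense(10, activation="softmax")(x)  # 10 classes assumed
model = keras.Model(inputs, outputs)

# Step 4: train only the new layer with gradient descent.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, epochs=5)  # train_ds: a tf.data.Dataset of (image, label) pairs
```

Step 5 would then set `base_model.trainable = True`, recompile with a very low learning rate, and call `fit` again to continue training the whole network.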
A reason to use gradient descent rather than a different ML algorithm, as you suggest, is that it enables further training after the initial fine-tuning (step #4 above). However, this is not required. The approach you suggest (taking the output of the pre-trained model as input to another ML model, such as a logistic regression) may provide satisfactory performance and be more computationally efficient.
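For comparison, here is a minimal sketch of that alternative, assuming scikit-learn for the logistic regression, dummy arrays in place of a real dataset, and the same hypothetical ResNet50 backbone as above used purely as a frozen feature extractor.

```python
import numpy as np
from tensorflow import keras
from sklearn.linear_model import LogisticRegression

# Frozen backbone that maps each image to a fixed-length feature vector.
feature_extractor = keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(224, 224, 3))

# Dummy data standing in for a real training set.
X_train = np.random.rand(8, 224, 224, 3).astype("float32")
y_train = np.random.randint(0, 10, size=8)

# Extract features once (no gradient descent through the backbone) and fit a
# plain logistic regression on them.
features = feature_extractor.predict(X_train)  # shape (8, 2048) with pooling="avg"
clf = LogisticRegression(max_iter=1000).fit(features, y_train)
predictions = clf.predict(feature_extractor.predict(X_train))
```

The computational saving comes from computing the backbone's features only once per example, rather than re-running the forward and backward pass through the frozen layers on every training epoch.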
Tradeoffs between these approaches are also discussed in the Keras Transfer Learning guide, in the section on the "Typical Transfer Learning Workflow".
Answered by grov on May 2, 2021