Transfer learning vs fine-tuning

Transfer learning is about "transferring" learned representations to another problem. For example, one can use features from a pre-trained convolutional neural network (ConvNet) to power a linear support vector machine (SVM). In such a case the pre-trained model is held fixed while only the linear SVM weights are updated.
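The setup above can be sketched in a few lines. This is a minimal, self-contained toy: the "pre-trained" extractor is stood in for by a fixed random projection plus ReLU (purely hypothetical weights), and a logistic-regression head is trained by gradient descent in place of a linear SVM to avoid extra dependencies. The key point it illustrates is that the extractor's weights are never updated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pre-trained feature extractor f():
# a fixed random projection plus ReLU. These weights stay frozen.
W_f = rng.normal(size=(4, 8))

def f(x):
    return np.maximum(x @ W_f, 0.0)  # frozen: never updated below

# Toy binary classification data.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Linear head trained on top of the frozen features (logistic
# regression here, where the post uses a linear SVM).
w = np.zeros(8)
b = 0.0
lr = 0.1
feats = f(X)  # frozen features: computed once, reused every step
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    w -= lr * (feats.T @ (p - y)) / len(y)  # only the head's weights
    b -= lr * np.mean(p - y)                # are ever updated

acc = np.mean(((feats @ w + b) > 0) == (y > 0.5))
```

Because `f()` is fixed, `feats` can be precomputed once for the whole dataset, which is one of the practical attractions of this setup.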

Fine-tuning, on the other hand, is about making small adjustments to further improve performance. For example, during transfer learning you can unfreeze the pre-trained model and let it adapt further to the task at hand.
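The unfreezing step can be sketched as a second training phase. In this toy (all weights and data hypothetical), a head is first trained on top of a frozen one-layer extractor; then the extractor is unfrozen and everything is updated jointly with a much smaller learning rate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "pre-trained" extractor f(): one ReLU layer.
W_f = rng.normal(size=(4, 8)) * 0.5

X = rng.normal(size=(100, 4))
y = X @ rng.normal(size=4)  # toy regression target

w_g = np.zeros(8)           # head g() on top of f()

def loss(W_f, w_g):
    return np.mean((np.maximum(X @ W_f, 0.0) @ w_g - y) ** 2)

# Phase 1: transfer learning -- only the head g() is trained.
for _ in range(300):
    h = np.maximum(X @ W_f, 0.0)
    err = h @ w_g - y
    w_g -= 0.01 * (h.T @ err) / len(y)

frozen_loss = loss(W_f, w_g)

# Phase 2: fine-tuning -- unfreeze f() and update both f() and g()
# with a much smaller learning rate.
for _ in range(300):
    h = np.maximum(X @ W_f, 0.0)
    err = h @ w_g - y
    grad_h = np.outer(err, w_g) * (h > 0)   # back-prop through ReLU
    W_f -= 0.001 * (X.T @ grad_h) / len(y)
    w_g -= 0.001 * (h.T @ err) / len(y)

finetuned_loss = loss(W_f, w_g)
```

The fine-tuned loss ends up below the frozen-feature loss because updating `W_f` opens descent directions the frozen phase could not use.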

Thus:

  • Transfer learning
    • Is about projecting all new inputs through a pre-trained model. If we have a pre-trained model function f() and wish to learn a new function g(), we can simplify g() by learning g(f(x)) instead. This way g() sees all the data through f().
    • Can involve fine-tuning the pre-trained model: f() can be updated as well during the learning process.
  • In machine learning (ML), the learning process itself could be called fine-tuning, since each update takes a tiny step in the direction opposite the gradient. In practice, though, "fine-tuning" is reserved for a final stage in which the learning rate is set very low so the weights are only finely adjusted. Training normally starts from a poor initial state with a fairly large learning rate, and may end with a fine-tuning phase at a small learning rate. You can in fact split training into multiple fine-tuning phases, each with a smaller learning rate than the last.
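The multi-phase idea above can be sketched on a toy problem. This is a minimal illustration (the quadratic objective and learning rates are made up): gradient descent runs in three phases, each with a learning rate a tenth of the previous one, and the loss keeps shrinking phase by phase:

```python
import numpy as np

def gd_phase(w, lr, steps):
    """Run gradient descent on the quadratic (w - 3)^2."""
    for _ in range(steps):
        w -= lr * 2.0 * (w - 3.0)  # gradient of (w - 3)^2
    return w

w = 0.0
losses = []
for lr in (0.1, 0.01, 0.001):      # each phase fine-tunes the last
    w = gd_phase(w, lr, 50)
    losses.append((w - 3.0) ** 2)
```

Each later phase takes smaller steps near the minimum, mirroring how a low learning rate in fine-tuning only nudges already-good weights.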

Thus there is a difference between the two.
