Posts

Showing posts from May, 2022

Transfer learning vs fine tuning

  Transfer learning is about “transferring” learnt representations to another problem. For example, one can use features from a pre-trained convolutional neural network (convNet) to power a linear support vector machine (SVM). In that case the pre-trained model is held fixed while only the linear SVM weights are updated. Fine tuning, on the other hand, is about making small adjustments to further improve performance: for example, during transfer learning you can unfreeze the pre-trained model and let it adapt further to the task at hand. Thus transfer learning is about projecting all new inputs through a pre-trained model: if we have a pre-trained model function f() and wish to learn a new function g…
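The frozen-extractor idea above can be sketched in a few lines of numpy. Here f() is a toy stand-in for a pre-trained model (a fixed random projection with a ReLU, never updated), and the new function g() is a small logistic-regression head trained on top of f()'s features; only g()'s weights move, which is the transfer-learning setting described above. All names and the toy task are illustrative assumptions, not a real pre-trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "pre-trained" feature extractor f(): a fixed (frozen) random
# projection plus ReLU. In practice this would be a convNet trained on a
# large dataset; here it just stands in for any frozen representation.
W_frozen = rng.normal(size=(2, 8))

def f(x):
    return np.maximum(x @ W_frozen, 0.0)  # weights are never updated

# Toy downstream task: label = 1 when x0 + x1 > 0.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Learn g() on top of f(): a logistic-regression head trained by gradient
# descent. Only w and b are updated -- this is transfer learning.
feats = f(X)
w = np.zeros(feats.shape[1])
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    grad = p - y
    w -= 0.1 * feats.T @ grad / len(y)
    b -= 0.1 * grad.mean()

acc = ((1.0 / (1.0 + np.exp(-(f(X) @ w + b))) > 0.5) == y).mean()
print(f"train accuracy with frozen f(): {acc:.2f}")
```

Fine tuning would correspond to additionally letting `W_frozen` receive gradient updates instead of holding it fixed.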

What is the difference between Sentence Encodings and Contextualized Word Embeddings?

  A contextualized word embedding is a vector representing a word in a specific context. Traditional word embeddings such as Word2Vec and GloVe generate one vector per word, whereas a contextualized word embedding generates a vector for a word depending on its context. Consider the sentences “The duck is swimming” and “You shall duck when someone shoots at you”. With traditional word embeddings, the word vector for “duck” would be the same in both sentences, whereas in the contextualized case it should differ. While word embeddings encode words into a vector representation, there is also the question of how to represent a whole sentence in a way a computer can easily work with. Sentence encodings embed a whole sentence as one vector; doc2vec, for example, generates a vector for a sentence, and BERT also produces a representation for the whole sentence via the [CLS] token. So in short, a contextualized w…
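The “duck” example can be made concrete with a toy numpy sketch. The static table plays the role of Word2Vec/GloVe, and the “contextualized” function crudely mixes a word's vector with its sentence average, a hypothetical stand-in for what BERT or ELMo compute with attention over the full context; it is not a real model, just an illustration of the definitions above.

```python
import numpy as np

rng = np.random.default_rng(42)

# Static (Word2Vec/GloVe-style) lookup table: one fixed vector per word.
vocab = ["the", "duck", "is", "swimming", "you", "shall", "when",
         "someone", "shoots", "at"]
static = {w: rng.normal(size=4) for w in vocab}

def static_embed(word, sentence):
    return static[word]  # context is ignored entirely

def contextual_embed(word, sentence):
    # Toy "contextualized" embedding: the word's static vector mixed with
    # the mean of its sentence, so the same word gets different vectors
    # in different contexts.
    ctx = np.mean([static[w] for w in sentence], axis=0)
    return 0.5 * static[word] + 0.5 * ctx

s1 = ["the", "duck", "is", "swimming"]
s2 = ["you", "shall", "duck", "when", "someone", "shoots", "at", "you"]

same_static = np.allclose(static_embed("duck", s1), static_embed("duck", s2))
same_ctx = np.allclose(contextual_embed("duck", s1), contextual_embed("duck", s2))
print(same_static, same_ctx)  # static vectors match; contextual ones differ
```

The sentence mean used inside `contextual_embed` is itself a (very crude) sentence encoding: one vector for the whole sentence, in the spirit of doc2vec or BERT's [CLS] representation.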

Depth Data Science Interview

    Here are 3 practical ML tips you don't read about in textbooks. I learned these while building ML solutions at PayPal and Google. Want to level up your ML skills? Read on 👇

    (1) Variable Importance on Collinear Features ❗ Don't blindly trust variable importance from a random forest. The variable importance of a feature increases whenever the model splits on that feature, so when two features are collinear the importance gets diluted between them. ⭐ The better approach is to remove collinearity first, using variable selection with Pearson/Spearman correlation, VIF, or Lasso regression. Then fit the random forest (or any other tree-based model) and interpret the variable importance of the remaining features.

    (2) Random Forest (RF) on Continuous Target Variable ❗ If you are using RF or other tree-based models (e.g. XGBoost), be aware that your target prediction…
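The correlation-based variable selection in tip (1) can be sketched with a greedy Pearson filter in numpy: build two nearly collinear copies of one signal, then keep a feature only if its absolute correlation with every already-kept feature stays below a threshold. The `drop_collinear` helper and the 0.9 cutoff are illustrative assumptions, not a fixed recipe.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy design matrix: x0 and x1 are nearly collinear copies of the same
# signal; x2 is independent of both.
n = 500
signal = rng.normal(size=n)
X = np.column_stack([
    signal + 0.01 * rng.normal(size=n),  # x0
    signal + 0.01 * rng.normal(size=n),  # x1 (collinear with x0)
    rng.normal(size=n),                  # x2 (independent)
])

def drop_collinear(X, threshold=0.9):
    """Greedy Pearson filter: keep a feature only if its absolute
    correlation with every already-kept feature is below the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

kept = drop_collinear(X)
print("kept feature indices:", kept)  # x1 is dropped
```

A tree-based model fit on the kept columns no longer splits importance between the two collinear copies, which is exactly the dilution the tip warns about.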