Posts

Showing posts from August, 2021

One-Stop Solution to: Closures vs Decorators


AI ML NLP

Chris McCormick, James Briggs

What is defaultdict

Usually, a Python dictionary raises a KeyError if you try to get an item with a key that is not currently in the dictionary. The defaultdict, in contrast, simply creates any item that you try to access, provided it does not exist yet. To create such a "default" item, it calls the function object that you pass to the constructor (more precisely, an arbitrary "callable" object, which includes function and type objects). With defaultdict(int), default items are created using int(), which returns the integer object 0. With defaultdict(list), default items are created using list(), which returns a new empty list object.

    from collections import defaultdict

    somedict = {}
    print(somedict[3])  # raises KeyError

    someddict = defaultdict(int)
    print(someddict[3])  # prints int(), thus 0

The defaultdict is a subclass of the dict class. Its importance lies in the fact that it allows each new key to be given a default ...
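The default-factory behavior described above can be sketched with a small grouping and counting example. This is only an illustration: the word list and the variable names (words, by_letter, counts) are made up for the example and do not come from the original post.

```python
from collections import defaultdict

# Hypothetical sample data, used only for illustration.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# defaultdict(list): a missing key is created with list(), i.e. an empty list.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# defaultdict(int): a missing key is created with int(), i.e. 0.
counts = defaultdict(int)
for word in words:
    counts[word[0]] += 1

print(dict(by_letter))
print(dict(counts))

# Note: merely reading a missing key also inserts it.
print("z" in counts)   # checked before any access to counts["z"]
print(counts["z"])     # the lookup creates the key with the default value
print("z" in counts)   # the key now exists
```

With a plain dict, both loops would need an explicit check (or setdefault) before the first append or increment. The last three lines show the side effect worth remembering: a lookup on a defaultdict adds the key.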

Challenges and limitations of Tokenization Task

Challenges and limitations of the tokenization task. In general, this task is applied to text corpora written in languages such as English or French, which separate words with white space and use punctuation marks to define sentence boundaries. Unfortunately, this method is not applicable to other languages such as Chinese, Japanese, Korean, Thai, Hindi, Urdu, Tamil, and others. This problem creates the need to develop a common tokenization tool that covers all languages.
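A minimal sketch of the limitation, assuming the simplest possible tokenizer (splitting on white space). The function name whitespace_tokenize and both sample sentences are hypothetical and chosen only for illustration.

```python
def whitespace_tokenize(text):
    """Split text on runs of white space (a deliberately naive tokenizer)."""
    return text.split()


# Hypothetical sample sentences.
english = "Tokenization splits text into words."
chinese = "我喜欢自然语言处理"  # roughly: "I like natural language processing"

english_tokens = whitespace_tokenize(english)
chinese_tokens = whitespace_tokenize(chinese)

print(english_tokens)  # five tokens; the final period stays attached to "words."
print(chinese_tokens)  # one token: the sentence contains no white space
```

The English sentence is split into word-like units, while the Chinese sentence comes back as a single token, because word boundaries in that text are not marked by spaces.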

Machine Learning Video Library


Set up a Python environment in VS Code


Cosine Similarity

Cosine Similarity (Tf-Idf)

WordNet in Python

WordNet is a lexical database of English that is available through the Natural Language Toolkit (NLTK) for Python. NLTK is an extensive library built to make Natural Language Processing (NLP) easy. Some basic functions will be discussed in this article.

To start using WordNet, you have to import it first:

    from nltk.corpus import wordnet

Synsets and Lemmas

In WordNet, similar words are grouped into a set known as a Synset (short for Synonym-set). Every Synset has a name, a part of speech, and a number. The words in a Synset are known as Lemmas.

Getting Synsets

The function wordnet.synsets('word') returns a list containing all the Synsets related to the word passed to it as the argument. Example:

    print(wordnet.synsets('room'))

Output:

    [Synset('room.n.01'), Synset('room.n.02'), Synset('room.n.03'), Synset('room.n.04'), Synset('board.v.02')]

The method returned five Synsets; four have the name...