Jobs, stages, tasks

 https://www.linkedin.com/pulse/shufflehashjoin-what-why-when-akhil-pathirippilly-mana/


  • jobs - started whenever the code running on the driver node encounters an action (e.g. collect(), take()); each job computes a value and returns it to the driver
  • stages - the parts of a job whose tasks can run without shuffling data; each shuffle marks the boundary between two stages
  • tasks - computations of the same type, each applied to a different partition of the data, which can run in parallel on worker nodes
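
The hierarchy above can be sketched with a toy model in plain Python. This is not Spark's scheduler or API: the function names (split_into_stages, plan_job) and the NARROW/WIDE sets are hypothetical, and the sketch assumes that a stage boundary falls in front of every shuffling transformation and that the partition count stays the same across a shuffle, so each stage gets one task per partition.

```python
# Toy model only: illustrates job -> stages -> tasks, not Spark's real scheduler.

# Transformations that work partition by partition (no shuffle needed).
NARROW = {"map", "filter", "flatMap"}
# Transformations that need data from other partitions (shuffle needed).
WIDE = {"reduceByKey", "groupByKey", "join"}


def split_into_stages(transformations):
    """Group transformations into stages, starting a new stage at each shuffle."""
    stages = [[]]
    for op in transformations:
        if op not in NARROW and op not in WIDE:
            raise ValueError(f"unknown transformation: {op}")
        # A wide transformation reads shuffled data, so it opens a new stage
        # (unless the current stage is still empty).
        if op in WIDE and stages[-1]:
            stages.append([])
        stages[-1].append(op)
    return stages


def plan_job(transformations, action, num_partitions):
    """Describe the single job that an action would trigger."""
    stages = split_into_stages(transformations)
    return {
        "action": action,
        "stages": stages,
        # Assumption: one task per partition in every stage.
        "tasks_per_stage": [num_partitions] * len(stages),
    }


# A word-count style pipeline ending in the action collect(), on 4 partitions.
plan = plan_job(["flatMap", "map", "reduceByKey", "filter"], "collect", 4)
for number, (ops, tasks) in enumerate(zip(plan["stages"], plan["tasks_per_stage"])):
    print(f"stage {number}: {ops} -> {tasks} parallel tasks")
```

In this sketch the one action produces one job, the single reduceByKey splits that job into two stages, and each stage runs four tasks of the same type, one per partition.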
