Performance factors affecting Spark

Spark Memory

Performance is sensitive to:

  • application code,
  • configuration settings,
  • data layout and storage,
  • multi-tenancy,
  • resource allocation, and
  • elasticity in cloud deployments such as Amazon EMR, Microsoft Azure, Google Dataproc, Qubole, etc.
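As a sketch of how configuration settings and resource allocation are expressed in practice, a `spark-submit` invocation might look like the following. The application name, executor counts, and memory sizes are illustrative placeholders, not recommended values; they must be tuned per workload and cluster.

```shell
# Illustrative only: executor sizing and memory settings are hypothetical
# and depend on the workload, data volume, and cluster hardware.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 10 \
  --executor-cores 4 \
  --executor-memory 8g \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.memory.fraction=0.6 \
  my_app.py
```

On managed platforms such as EMR or Dataproc, many of these values can also be set cluster-wide (for example in `spark-defaults.conf`) rather than per job.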

Tuning memory usage comes down to three considerations:

  • the amount of memory used by your objects,
  • the cost of accessing those objects,
  • and the overhead of garbage collection.
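The first consideration, object memory footprint, is easy to underestimate because boxed objects carry per-object headers on top of their payload. This is not Spark code, but a minimal plain-Python sketch of the same effect the JVM exhibits: a collection of boxed integers uses far more memory than the same values stored contiguously.

```python
import sys
import array

# A list of 1000 boxed integers: each element is a separate object with
# its own header, plus the list holds a pointer to each one.
boxed = list(range(1000))
list_overhead = sys.getsizeof(boxed)               # the list's pointer array
per_object = sum(sys.getsizeof(i) for i in boxed)  # the boxed ints themselves

# The same values packed as contiguous 64-bit integers.
packed = array.array('q', range(1000))
packed_size = sys.getsizeof(packed)

print(per_object + list_overhead, packed_size)
```

The packed representation is several times smaller, which is why Spark's tuning advice favors primitive arrays and serialized storage formats over collections of small boxed objects.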
