Indy: Elasticity

Workloads may need more or fewer resources at unpredictable times, depending on demand.

overprovisioning = allocating more resources than you will ever need

underprovisioning: saves cost but needs autoscaling (giving more resources to the job as needed)
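
To make the cost trade-off concrete, here is a minimal sketch comparing a static allocation sized for peak demand with an allocation that follows demand hour by hour. The demand trace and both function names are made up for illustration, and the elastic case assumes scaling is instant and free, which real systems are not.

```python
def node_hours_static(demand_per_hour):
    """Node-hours paid when the cluster is fixed at peak demand.

    Sizing at the peak is the smallest static allocation that never
    falls short; real overprovisioning would add headroom on top.
    """
    return max(demand_per_hour) * len(demand_per_hour)


def node_hours_elastic(demand_per_hour):
    """Node-hours paid when the allocation tracks demand exactly.

    Assumption: scaling is instantaneous and has no overhead.
    """
    return sum(demand_per_hour)


# Hypothetical trace: nodes needed in each of six consecutive hours.
demand = [2, 3, 10, 4, 2, 3]

static_cost = node_hours_static(demand)
elastic_cost = node_hours_elastic(demand)
print(static_cost, elastic_cost, static_cost - elastic_cost)
```

The gap between the two numbers is the idle capacity paid for under the static allocation.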

Elasticity consists of 2 parts

Mechanisms to scale out and scale in a given job
- Can be done manually at a low level, as in Amazon Redshift
Policies to decide when and how much to scale up/down
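
A minimal sketch of what a policy could look like, kept separate from the mechanism: the function only computes a target size, and some mechanism (not shown) would then apply it. The proportional target-utilization rule, the function name, and the default bounds are assumptions for illustration, not something these notes prescribe. Utilization is given as an integer percentage so that the rounding is exact.

```python
def desired_replicas(current_replicas, observed_util_pct, target_util_pct=60,
                     min_replicas=1, max_replicas=20):
    """Hypothetical policy: pick a size that moves utilization toward the target.

    Rule: desired = ceil(current * observed / target), then clamp to
    [min_replicas, max_replicas].
    """
    if current_replicas <= 0 or target_util_pct <= 0:
        raise ValueError("current_replicas and target_util_pct must be positive")
    load = current_replicas * observed_util_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    raw = (load + target_util_pct - 1) // target_util_pct
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90))   # overloaded: scale out
print(desired_replicas(4, 30))   # underused: scale in
```

The clamp is what keeps the policy from scaling to zero on an idle job or growing without bound on a load spike.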

Elasticity is often combined with Multi-tenancy

Without multi-tenancy, each team/job gets its own dedicated cluster
CONS for having a fixed size cluster for each job
- When a job is under-utilizing, another job cannot use the unused compute resources
- Cannot scale beyond subcluster size
SOLUTION is to have one consolidated cluster for all projects and jobs
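
Both CONS can be illustrated with a small sketch that compares how much demand is served by fixed subclusters versus one consolidated cluster of the same total size. The demand numbers and function names are hypothetical, and the model ignores scheduling overhead, fairness, and placement constraints.

```python
def served_dedicated(demands, subcluster_size):
    """Each job is capped at its own subcluster; spare capacity is not shared."""
    return sum(min(d, subcluster_size) for d in demands)


def served_consolidated(demands, subcluster_size):
    """Same total capacity, pooled: only the aggregate demand is capped."""
    total_capacity = subcluster_size * len(demands)
    return min(sum(demands), total_capacity)


# Hypothetical demand (in nodes) for three jobs, each with an 8-node subcluster.
demands = [2, 14, 8]

print(served_dedicated(demands, 8))     # job 2 is stuck at 8 while job 1 leaves 6 idle
print(served_consolidated(demands, 8))  # job 2 can borrow the idle nodes
```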

The main problems in this space stem from the need to make allocation and scheduling decisions dynamically

Autoscaling solutions exist for Kubernetes, YARN, and other “distributed OSes”. Some are domain-specific, tailored to ML workloads or microservices; others are generic and work for all types of workloads, but typically don’t perform as well as the domain-specific ones.