Workloads may need more or fewer resources over time as demand changes, often unpredictably.

Overprovisioning: allocating more resources than you will ever need; simple, but pays for capacity that sits idle

Underprovisioning: allocating less than peak demand requires; saves cost but needs autoscaling (giving the job more resources as needed)
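
The cost trade-off between the two approaches can be illustrated with a small back-of-the-envelope calculation. This is only a sketch: the demand trace is made up, the function names are hypothetical, and the elastic case is idealized (it assumes the allocation can follow demand at every step with no scaling delay).

```python
import math

def static_peak_cost(demand, headroom=1.0):
    """Resource-hours when we provision for the peak during the whole trace.

    headroom > 1.0 models extra safety margin on top of the peak (assumption).
    """
    return math.ceil(max(demand) * headroom) * len(demand)

def elastic_cost(demand):
    """Resource-hours when the allocation follows demand at every step.

    Idealized: ignores scaling lag and any per-scale-event overhead.
    """
    return sum(math.ceil(d) for d in demand)

# Hypothetical hourly demand, in "machines needed".
demand = [2, 3, 10, 4, 2, 1]

print(static_peak_cost(demand))  # 60 resource-hours
print(elastic_cost(demand))      # 22 resource-hours
```

In practice the gap is smaller than this, because a real autoscaler reacts with some delay and usually keeps a safety margin.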

Elasticity consists of 2 parts

Elasticity is often combined with multi-tenancy (several users or jobs sharing the same cluster)

The main problems in this space stem from the need to make allocation and scheduling decisions dynamically, while workloads are running

Autoscaling solutions exist for Kubernetes, YARN, and other “distributed OSes”. Some are domain-specific, tailored for ML workloads or microservices. Others are generic and work for all types of workloads, but they typically don’t perform as well as the domain-specific ones.
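
As an example of a generic autoscaling rule, here is a minimal sketch modeled on the proportional formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The function name, parameter names, and the replica bounds are illustrative assumptions; the real controller also handles missing metrics, pod readiness, and stabilization windows, none of which is modeled here.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Proportional scaling rule in the style of the Kubernetes HPA.

    current_metric / target_metric: e.g. average CPU utilization per replica.
    tolerance: skip scaling when the ratio is close to 1.0, to avoid flapping.
    min_replicas / max_replicas: illustrative bounds (assumption).
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target, do nothing
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))

# 4 replicas at 90% average CPU with a 50% target: scale out.
print(desired_replicas(4, 90, 50))  # 8
# Load drops to 10% average CPU: scale in.
print(desired_replicas(4, 10, 50))  # 1
```

A rule like this knows nothing about the workload beyond a single metric, which is one reason generic autoscalers tend to lag behind domain-specific ones.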