Workloads may need more or fewer resources over time as demand fluctuates, often unpredictably.
overprovisioning: allocating more resources than the workload will ever need
underprovisioning: allocating less than peak demand; saves cost but requires autoscaling (giving more resources to the job as needed)
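To make the cost trade-off concrete, here is a minimal sketch comparing the two strategies on a made-up demand trace. The function names, the trace, and `unit_capacity` are all hypothetical, and the autoscaled case assumes resources can be resized instantly and for free, which real systems cannot do.

```python
import math

def static_peak_cost(demand, unit_capacity):
    """Unit-hours consumed when the peak allocation is held for the whole trace."""
    peak_units = max(math.ceil(d / unit_capacity) for d in demand)
    return peak_units * len(demand)

def autoscaled_cost(demand, unit_capacity):
    """Unit-hours consumed when the allocation tracks demand at every step.

    Assumption: scaling is instantaneous and free (idealized).
    """
    return sum(math.ceil(d / unit_capacity) for d in demand)

# Hypothetical hourly demand (e.g. requests/sec) and per-unit capacity.
demand = [10, 20, 80, 30, 10, 5]
unit_capacity = 20

print(static_peak_cost(demand, unit_capacity))   # 24 unit-hours
print(autoscaled_cost(demand, unit_capacity))    # 10 unit-hours
```

The gap between the two numbers is the capacity that sits idle when provisioning for the peak; in practice, scaling delays and safety margins narrow it.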
Elasticity consists of 2 parts
Elasticity is often combined with multi-tenancy
The main problems in this space stem from the need to make allocation and scheduling decisions dynamically
Autoscaling solutions exist for Kubernetes, YARN, and other “distributed OSes”. Some are domain-specific, tailored to ML workloads or microservices; others are generic and work for all types of workloads, but typically do not perform as well as the domain-specific ones.
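As an illustration of a generic autoscaling rule, the sketch below applies the proportional formula that the Kubernetes Horizontal Pod Autoscaler documents: desired = ceil(current * currentMetric / targetMetric). The function name, the integer-percent utilization inputs, and the default min/max bounds are assumptions made for this example; the real controller also applies a tolerance band and stabilization windows that are left out here.

```python
import math

def desired_replicas(current_replicas, current_util, target_util,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling rule, clamped to [min_replicas, max_replicas].

    Utilization values are integer percentages (assumption for this sketch).
    """
    raw = math.ceil(current_replicas * current_util / target_util)
    return max(min_replicas, min(max_replicas, raw))

# 4 replicas at 90% utilization with a 60% target: scale out to 6.
print(desired_replicas(4, 90, 60))   # 6
# 4 replicas at 30% utilization with a 60% target: scale in to 2.
print(desired_replicas(4, 30, 60))   # 2
# The computed 12 replicas is capped by max_replicas.
print(desired_replicas(8, 90, 60))   # 10
```

A rule like this only looks at one aggregate metric, which is part of why generic autoscalers can lag behind domain-specific ones that understand the workload's structure.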