
Kubernetes Autoscaling: A Comprehensive Review on Machine Learning Techniques

by Ratana Soth, Sokleng LY, and Leangsiv Sok

Posted: June 15, 2026
Server: Preprints.org
DOI: 10.20944/preprints202606.1094.v1

Kubernetes has become one of the most widely adopted orchestration platforms for deploying and managing cloud-native applications. Its native autoscaling mechanisms, including the Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), and Cluster Autoscaler (CA), together with event-driven autoscaling frameworks such as KEDA, provide elasticity for containerized workloads. However, most default autoscaling mechanisms remain reactive: scaling actions are triggered only after resource utilization or external metrics exceed predefined thresholds. This reactive behavior can cause cold-start delays, over-provisioning, under-provisioning, service-level objective (SLO) violations, and inefficient resource utilization, especially in dynamic microservice and multi-tenant environments. Machine learning (ML) has therefore emerged as an important research direction for proactive and adaptive Kubernetes autoscaling. This paper presents a comprehensive review of ML-based Kubernetes autoscaling techniques, organized into four major learning categories: supervised learning, unsupervised learning, reinforcement learning, and semi-supervised learning. Each category is further divided into Level-1 subcategories, including regression, classification, time-series forecasting, clustering, dimensionality reduction, association/pattern mining, value-based reinforcement learning, policy-based reinforcement learning, actor-critic methods, generative/consistency-based learning, graph-based learning, and pseudo-labeling/self-training. The review analyzes how these techniques support workload prediction, resource demand estimation, workload classification, anomaly detection, dependency modeling, adaptive scaling policies, and knowledge transfer.
The findings show that supervised time-series forecasting is the most mature direction for proactive HPA, that graph-based learning is increasingly important for dependency-aware microservice autoscaling, and that reinforcement learning is promising for adaptive closed-loop resource optimization. Semi-supervised learning remains an emerging but important direction for environments where labeled Kubernetes telemetry is limited.
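
To make the reactive-versus-proactive distinction in the abstract concrete, the following is a minimal Python sketch, not code from the preprint. `reactive_replicas` follows the replica formula documented for the Kubernetes HPA (desired = ceil(current replicas × current metric / target metric), with a default tolerance of 0.1). `naive_forecast` and `proactive_replicas` are hypothetical names introduced here for illustration, and the linear extrapolation is only an assumed stand-in for the ML forecasters the review surveys.

```python
import math


def reactive_replicas(current_replicas, current_util, target_util, tolerance=0.1):
    """Replica count from the documented HPA formula.

    Utilization values are percentages (e.g. 90 for 90%). Scaling is skipped
    when the ratio is within `tolerance` of 1.0, as the HPA does by default.
    """
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * current_util / target_util)


def naive_forecast(history):
    """Hypothetical stand-in for an ML forecaster (assumption, not from the
    preprint): extrapolate one step ahead from the last two samples."""
    return history[-1] + (history[-1] - history[-2])


def proactive_replicas(current_replicas, history, target_util, tolerance=0.1):
    """Apply the same formula to the *predicted* utilization instead of the
    observed one, so capacity is added before the threshold is crossed."""
    predicted = naive_forecast(history)
    return reactive_replicas(current_replicas, predicted, target_util, tolerance)


if __name__ == "__main__":
    history = [50, 70, 90]  # CPU utilization (%) over the last three intervals
    print(reactive_replicas(4, history[-1], 60))  # reacts to the observed 90%
    print(proactive_replicas(4, history, 60))     # acts on the forecast 110%
```

With a rising load, the reactive rule sizes the deployment for the utilization already observed, while the proactive variant sizes it for the next interval; how much that helps in practice depends entirely on forecast quality, which is what the reviewed techniques aim to improve.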
