Scaling
Horizontal Pod Autoscaling
- Uses the K8s Metrics Server
- Pods must have requests and limits defined
- The HPA checks the Metrics Server every 30 seconds
- Scale according to the min and max number or replicas defined
- Cooldown / Delay
- Prevent racing conditions
- Once a change has been made, HPA waits
- By default, the delay on scale up events is 3 minutes, and the delay on scale down events is 5 minutes
kubectl - HPA Cheat Sheet
# The imperative way
kubectl autoscale deployment [name] --cpu-percent=50 --min=3 --max=10
# The declarative way
kubectl apply -f [hap.yaml]
# Get the autoscaler status
kubectl get hpa [name]
# Delete the HPA
kubectl delete -f [hap.yaml]
# Delete the HPA
kubectl delete hpa [name]