Orchestration Core

Scaling & scheduling

Lesson 5 of 5

What you'll learn

Use requests and limits to declare a Pod's resource footprint
Understand the HPA's desired-replicas formula at a high level
See how the scheduler places Pods on nodes by available CPU and memory

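Every container should declare requests (a guaranteed reservation, used for scheduling) and limits (a hard ceiling, enforced at runtime). Requests are the contract the scheduler plans against; limits protect neighbors from a runaway Pod. A container that hits its CPU limit is throttled, while one that exceeds its memory limit is terminated (OOM-killed).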

      resources:
        requests:
          cpu: "250m"
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "512Mi"
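To make the quantity strings in the fragment above concrete, here is a minimal sketch of how they translate into plain numbers. It is not the Kubernetes parser: the helper names `cpu_to_millicores` and `memory_to_bytes` are hypothetical, and only a few common suffixes are handled.

```python
# Simplified sketch: covers the suffixes used in this lesson and a few others.
# The real Kubernetes quantity grammar accepts more forms than this.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # A bare number is a count of whole cores; 1 core = 1000 millicores.
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "256Mi" to bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a bare number is already in bytes


print(cpu_to_millicores("250m"))  # 250 (a quarter of one core)
print(memory_to_bytes("256Mi"))   # 268435456
```

Read this way, the request above reserves a quarter of a core and 256 MiB, and the limit allows bursts up to half a core and 512 MiB.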

Horizontal autoscaling

The HorizontalPodAutoscaler watches a metric (commonly CPU utilization measured against requests) and adjusts the replica count. The core idea: desired = ceil(currentReplicas * currentMetric / targetMetric). At 80% CPU with a 50% target and 4 replicas, that gives ceil(4 * 80 / 50) = ceil(6.4) = 7, so it scales toward 7.

kubectl autoscale deploy/api --cpu-percent=50 --min=2 --max=10
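The desired-replicas formula can be checked with a few lines of Python. This is a sketch of the core arithmetic only: the function name `desired_replicas` is hypothetical, the clamping mirrors the `--min` and `--max` flags in the command above, and the real controller applies further rules (such as a tolerance band around the target) that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Core HPA idea: scale replicas in proportion to metric / target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (the --min / --max values).
    return max(min_replicas, min(raw, max_replicas))


# The lesson's example: 4 replicas at 80% CPU with a 50% target.
print(desired_replicas(4, 80, 50, min_replicas=2, max_replicas=10))   # 7
# Load far above target: the raw answer is 20, capped at the maximum.
print(desired_replicas(10, 100, 50, min_replicas=2, max_replicas=10))  # 10
```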

Scheduling is bin-packing

When a Pod is created, the scheduler filters out nodes that lack enough unreserved CPU and memory for the Pod's requests, scores the survivors, and binds the Pod to the best one. If no node fits, the Pod stays Pending, which is the signal a cluster autoscaler uses to add nodes. This is a bin-packing problem: fit Pods into finite node capacity.
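The filter, score, and bind steps can be sketched as follows. This is an illustration under stated assumptions, not the kube-scheduler's implementation: the `Node` class and `schedule` function are hypothetical, and the scoring rule (prefer the node left with the most CPU, then the most memory) is picked only to keep the example small.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_m: int    # unreserved CPU, in millicores
    free_mem_mi: int   # unreserved memory, in MiB


def schedule(nodes: List[Node], cpu_m: int, mem_mi: int) -> Optional[str]:
    """Return the chosen node's name, or None if the Pod stays Pending."""
    # Filter: keep only nodes whose unreserved capacity covers the requests.
    feasible = [n for n in nodes
                if n.free_cpu_m >= cpu_m and n.free_mem_mi >= mem_mi]
    if not feasible:
        return None
    # Score (assumed rule): most CPU left after placement, then most memory.
    best = max(feasible,
               key=lambda n: (n.free_cpu_m - cpu_m, n.free_mem_mi - mem_mi))
    # Bind: reserve the Pod's requests on the chosen node.
    best.free_cpu_m -= cpu_m
    best.free_mem_mi -= mem_mi
    return best.name


nodes = [Node("node-a", 300, 1024),
         Node("node-b", 1000, 2048),
         Node("node-c", 2000, 128)]
print(schedule(nodes, 250, 256))   # node-b (node-c is filtered out on memory)
print(schedule(nodes, 4000, 256))  # None: no node fits, so the Pod is Pending
```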

Requests are the planning currency

The scheduler reserves against requests, not actual usage. Under-requesting overcommits nodes: Pods contend for CPU and, when memory runs short, risk eviction. Over-requesting wastes capacity by reserving resources that sit idle. Tuning requests is one of the highest-leverage cost levers in a cluster.
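A little arithmetic shows both failure modes. The sketch below uses made-up numbers and a hypothetical helper, `packing_report`, to compare what the scheduler reserves on one node with what identical Pods would actually consume.

```python
def packing_report(allocatable_mi: int, request_mi: int, actual_use_mi: int) -> dict:
    """Compare reserved vs. actually used memory on one node (values in MiB)."""
    pods = allocatable_mi // request_mi   # Pods admitted, judged by requests alone
    used = pods * actual_use_mi           # what those Pods really consume
    return {
        "pods": pods,
        "reserved_mi": pods * request_mi,
        "used_mi": used,
        "overcommitted": used > allocatable_mi,
        "idle_mi": max(allocatable_mi - used, 0),
    }


# Under-requesting: each Pod asks for 256 MiB but uses 700 MiB.
print(packing_report(4096, request_mi=256, actual_use_mi=700))
# Over-requesting: each Pod asks for 1024 MiB but uses 300 MiB.
print(packing_report(4096, request_mi=1024, actual_use_mi=300))
```

In the first case the node admits 16 Pods whose real demand (11200 MiB) far exceeds its 4096 MiB; in the second it admits only 4 Pods and leaves 2896 MiB idle.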

Beyond the basics, Helm packages and templates these manifests for repeatable releases, and multi-cluster topologies (fleet management, federation, service mesh across clusters) are the natural next step as you scale beyond a single control plane.

A bin-packing scheduler

In this simplified model, each Pod binds to the first node with enough free CPU and memory for its requests; otherwise it stays Pending.
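A minimal first-fit sketch of that rule is shown below. The node names, Pod names, and capacities are invented for illustration, and the function `first_fit` is hypothetical; it skips the scoring step entirely and simply takes the first node that fits.

```python
def first_fit(pods, nodes):
    """Place each Pod on the first node that can hold its requests.

    pods:  list of (name, cpu_millicores, memory_mib) tuples
    nodes: dict mapping node name -> [free_cpu_millicores, free_memory_mib]
    """
    placement = {}
    for pod_name, cpu_m, mem_mi in pods:
        placement[pod_name] = "Pending"   # stays Pending unless a node fits
        for node_name, free in nodes.items():
            if free[0] >= cpu_m and free[1] >= mem_mi:
                free[0] -= cpu_m          # reserve the requests on this node
                free[1] -= mem_mi
                placement[pod_name] = node_name
                break
    return placement


nodes = {"node-a": [1000, 1024], "node-b": [2000, 4096]}
pods = [("api", 500, 512), ("worker", 750, 512),
        ("cache", 500, 2048), ("batch", 1000, 2048)]
print(first_fit(pods, nodes))
# {'api': 'node-a', 'worker': 'node-b', 'cache': 'node-b', 'batch': 'Pending'}
```

The `batch` Pod stays Pending because neither node has 1000 millicores left once the earlier Pods are placed, even though 1250 millicores remain free across the two nodes combined.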

Knowledge check

What value does the scheduler reserve against when deciding whether a Pod fits on a node?
