What changes for Kubernetes in production?
A demo cluster forgives a lot; production does not. Running Kubernetes for real means setting resource requests and limits, defining health probes, configuring autoscaling, and rolling out changes without dropping traffic.
Why it matters
The gap between "it runs on my cluster" and "it survives real traffic and failures" is where many outages originate. A pod without resource limits can starve its neighbors on the same node; a pod without probes can serve errors while still being reported as healthy. These settings are what make a cluster reliable, and operating one is core DevOps work.
What to learn
- Resource requests and limits, and what happens without them
- Liveness, readiness, and startup probes
- Horizontal Pod Autoscaling
- Rolling updates and rollbacks
- Pod disruption budgets
- Node affinity, taints, and tolerations at a high level
- Graceful shutdown and connection draining
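To make the last item concrete: when Kubernetes terminates a pod, it sends SIGTERM to the container, waits for the termination grace period (30 seconds by default), and then sends SIGKILL. A minimal sketch of the application side follows, assuming a Python service. The Drainer class and install_handler function are hypothetical names for illustration, not part of any library.

```python
import signal
import threading


class Drainer:
    """Tracks shutdown state so the app can stop taking work and drain."""

    def __init__(self) -> None:
        self._stopping = False
        self._in_flight = 0
        self._lock = threading.Lock()

    def handle_sigterm(self, signum, frame) -> None:
        # Only flips a flag; the handler takes no lock so it cannot deadlock.
        self._stopping = True

    def ready(self) -> bool:
        # A readiness endpoint could return 503 once this is False.
        return not self._stopping

    def begin_request(self) -> bool:
        # Refuse new work after SIGTERM; count accepted requests.
        with self._lock:
            if self._stopping:
                return False
            self._in_flight += 1
            return True

    def end_request(self) -> None:
        with self._lock:
            self._in_flight -= 1

    def drained(self) -> bool:
        # True once shutdown was requested and no request is still running.
        with self._lock:
            return self._stopping and self._in_flight == 0


def install_handler(drainer: Drainer) -> None:
    # Must be called from the main thread, as signal.signal requires.
    signal.signal(signal.SIGTERM, drainer.handle_sigterm)
```

In a real service, the main loop would call install_handler once at startup, then exit when drained() returns True, ideally well before the grace period ends.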
Common pitfall
Skipping the readiness probe. Without one, Kubernetes treats a pod as ready as soon as its containers start and may send traffic before the application can serve it, so users get errors during every deploy. A readiness probe tells the cluster when a pod is actually ready to receive requests, so rollouts shift traffic only to pods that can handle it.
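As an illustration, here is a minimal sketch of a Deployment that declares a readiness probe, built as a Python dict and printed as JSON. The field names (readinessProbe, httpGet, periodSeconds, failureThreshold) come from the Kubernetes API; the app name web, the image example.com/web:1.0.0, the path /healthz/ready, and port 8080 are placeholder assumptions.

```python
import json


def build_deployment(image: str = "example.com/web:1.0.0") -> dict:
    """Return an apps/v1 Deployment manifest with a readiness probe."""
    # Name, image, path, and port are hypothetical placeholders.
    container = {
        "name": "web",
        "image": image,
        "ports": [{"containerPort": 8080}],
        "readinessProbe": {
            # The pod receives Service traffic only while this check passes.
            "httpGet": {"path": "/healthz/ready", "port": 8080},
            "periodSeconds": 5,
            "failureThreshold": 3,
        },
    }
    labels = {"app": "web"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "web"},
        "spec": {
            "replicas": 3,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_deployment(), indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output can be piped to kubectl apply -f - on a test cluster. The probe timings shown are starting points to tune, not recommendations.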
Resources
Primary (free):
- Kubernetes — Configure probes · docs
- Kubernetes — Resource management · docs
- Kubernetes — Horizontal Pod Autoscaler · docs
Practice
Add resource requests and limits and a readiness probe to your Deployment. Set up a Horizontal Pod Autoscaler that targets CPU utilization. Trigger a rolling update with a new image tag while sending steady requests to the service, and confirm they keep succeeding throughout. Done when a deploy causes zero failed requests.
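For the autoscaling step, it helps to know roughly how the Horizontal Pod Autoscaler picks a replica count. The Kubernetes documentation gives the core calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below models only that step; the function name and the min/max defaults are assumptions, and the real controller also accounts for pod readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified model of the HPA replica calculation."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds configured on the autoscaler.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

CPU utilization here is a percentage of each pod's CPU request, which is one reason the autoscaler needs requests to be set before it can scale on CPU.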
Outcomes
- Set resource requests and limits sensibly.
- Configure readiness and liveness probes.
- Autoscale a workload on a metric.
- Roll out and roll back without dropping traffic.