Kubernetes in production · DevOps · Code with Animation

What changes for Kubernetes in production?

A demo cluster forgives a lot; production does not. Running Kubernetes for real means setting resource requests and limits, defining health probes, configuring autoscaling, and rolling out changes without dropping traffic.

Why it matters

The gap between "it runs on my cluster" and "it survives real traffic and failures" is where many outages begin. Containers without limits can use up a node's CPU and memory and starve their neighbors; pods without probes keep receiving traffic and serve errors while looking healthy. These settings are what make a cluster reliable, and operating one is core DevOps work.

What to learn

Resource requests and limits, and what happens without them
Liveness, readiness, and startup probes
Horizontal Pod Autoscaling
Rolling updates and rollbacks
Pod disruption budgets
Node affinity, taints, and tolerations at a high level
Graceful shutdown and connection draining
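
To make the autoscaling item above more concrete, here is a minimal Python sketch of the replica calculation described in the Horizontal Pod Autoscaler docs: desired replicas = ceil(current replicas x current metric / target metric), with no change while the ratio stays within a tolerance (0.1 by default). The function name and parameters are illustrative, and the sketch leaves out the min/max clamping, stabilization windows, and not-yet-ready pod handling that the real controller applies.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Replica count the autoscaler would aim for, per the documented formula.

    desired = ceil(current_replicas * current_metric / target_metric)
    The count is left unchanged while the ratio is within `tolerance` of 1.0.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

For example, 4 replicas averaging 90% CPU against a 60% target would scale to 6, while 4 replicas at 63% would stay at 4 because the ratio is inside the tolerance.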

Common pitfall

Skipping the readiness probe. Without one, Kubernetes treats a pod as ready as soon as its containers have started and can send traffic to it before the application is able to serve, so users get errors during every deploy. A readiness probe tells the cluster when a pod is actually ready to receive requests, so rollouts shift traffic only to pods that can handle it.
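
On the application side, a readiness probe needs something to call. The sketch below shows one possible HTTP endpoint using only the Python standard library. The /ready path, port 8080, and the names ready, probe_status, and make_server are assumptions made for illustration; the path and port must match whatever the probe in your Deployment points at. Kubernetes counts an HTTP probe as successful for any status from 200 up to (but not including) 400, so returning 503 until startup finishes keeps the pod out of rotation.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set by the application once startup work (config, caches, connections) is done.
ready = threading.Event()


def probe_status(path: str) -> int:
    """HTTP status for a probe request: 200 only when the app can serve."""
    if path != "/ready":  # hypothetical path; must match readinessProbe.httpGet.path
        return 404
    return 200 if ready.is_set() else 503


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(probe_status(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep periodic probe requests out of the access log


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Build (but do not start) the probe server; call serve_forever() to run it."""
    return ThreadingHTTPServer(("", port), ProbeHandler)
```

In a real service you would call ready.set() once initialization has finished and run make_server().serve_forever() in a background thread, or expose the same check through the web framework you already use.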

Resources

Primary (free):

Kubernetes — Configure probes · docs
Kubernetes — Resource management · docs
Kubernetes — Horizontal Pod Autoscaler · docs

Practice

Add resource requests and limits and a readiness probe to your Deployment. Set up a Horizontal Pod Autoscaler on CPU. Trigger a rolling update with a new image tag while sending steady requests to the service, and confirm that traffic is served throughout. You are done when a deploy causes zero failed requests.
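
One possible starting point for this exercise is sketched below: a small Python script that builds the Deployment and the autoscaler as plain dictionaries and prints them as JSON, which kubectl accepts as well as YAML. The field names follow the apps/v1 Deployment and autoscaling/v2 HorizontalPodAutoscaler APIs. Everything else is a placeholder chosen for illustration: the name web, the image example/web:1.0, port 8080, the /ready path, and all CPU, memory, and replica numbers. Tune them for your own workload.

```python
import json


def deployment(name: str, image: str, replicas: int = 3) -> dict:
    """Deployment with requests/limits, a readiness probe, and a cautious rollout."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # Add one new pod at a time; never remove a ready pod before
                # its replacement reports ready.
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": 8080}],
                        "resources": {  # placeholder values
                            "requests": {"cpu": "100m", "memory": "128Mi"},
                            "limits": {"cpu": "500m", "memory": "256Mi"},
                        },
                        "readinessProbe": {
                            "httpGet": {"path": "/ready", "port": 8080},
                            "periodSeconds": 5,
                            "failureThreshold": 3,
                        },
                    }],
                },
            },
        },
    }


def cpu_autoscaler(name: str, min_replicas: int = 3, max_replicas: int = 10,
                   target_cpu_percent: int = 70) -> dict:
    """HPA targeting average CPU utilization, measured relative to CPU requests."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment",
                               "name": name},
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization",
                               "averageUtilization": target_cpu_percent},
                },
            }],
        },
    }


def render(name: str, image: str) -> str:
    """Both objects wrapped in a v1 List, serialized as JSON."""
    items = [deployment(name, image), cpu_autoscaler(name)]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)


if __name__ == "__main__":
    print(render("web", "example/web:1.0"))
```

Saved as, say, manifests.py (a hypothetical filename), the output can be piped into kubectl apply -f - and the rollout followed with kubectl rollout status deployment/web. Changing the image tag and applying again triggers the rolling update; kubectl rollout undo deployment/web rolls it back.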

Outcomes

Set resource requests and limits sensibly.
Configure readiness and liveness probes.
Autoscale a workload on a metric.
Roll out and roll back without dropping traffic.
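
Rolling out without dropping traffic also depends on old pods shutting down cleanly. When a pod is terminated, Kubernetes sends its containers SIGTERM, waits for the termination grace period (30 seconds by default), and then sends SIGKILL. The following is a rough Python sketch of the application side, assuming the app can report how many requests are still in flight. The names shutting_down, handle_sigterm, drain, and the active_requests callback are hypothetical.

```python
import signal
import threading
import time

# Set when Kubernetes asks the pod to stop. A readiness endpoint can start
# failing once this is set so the pod is taken out of rotation.
shutting_down = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; the main loop does the actual draining."""
    shutting_down.set()


def install_handler() -> None:
    """Register the SIGTERM handler (call from the main thread at startup)."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def drain(active_requests, timeout: float = 20.0, poll: float = 0.1) -> bool:
    """Wait until active_requests() reaches zero or the timeout expires.

    Returns True if all in-flight requests finished in time. Keep `timeout`
    below the pod's termination grace period so the process exits on its own.
    """
    deadline = time.monotonic() + timeout
    while active_requests() > 0:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

A typical flow would be: call install_handler() at startup, wait on shutting_down in the main thread, stop accepting new connections, call drain(...), and then exit.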