Logs, metrics & traces · DevOps · Code with Animation

What are the three pillars?

Observability rests on three kinds of data. Logs are timestamped records of what happened. Metrics are numeric measurements sampled over time, such as request rate or CPU utilization. Traces follow a single request across services. Together they let you understand a running system from the outside.
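To make the three shapes concrete, here is a minimal sketch using only the Python standard library. The class names, field names, and sample values (LogRecord, MetricSample, Span, "gateway", "payments") are illustrative assumptions, not the schema of any particular tool.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass


@dataclass
class LogRecord:
    """A timestamped record of one thing that happened."""
    timestamp: float
    level: str
    message: str


@dataclass
class MetricSample:
    """One numeric measurement at one point in time."""
    name: str
    timestamp: float
    value: float


@dataclass
class Span:
    """One timed step of a request; spans sharing a trace_id form a trace."""
    trace_id: str
    span_id: str
    service: str
    start: float
    end: float


now = time.time()
trace_id = uuid.uuid4().hex  # hypothetical ID shared by every span of one request

log = LogRecord(now, "ERROR", "payment declined")
metric = MetricSample("http_requests_per_second", now, 42.0)
trace = [
    Span(trace_id, "a1", "gateway", now, now + 0.120),
    Span(trace_id, "b2", "payments", now + 0.010, now + 0.110),
]

print(json.dumps(asdict(log)))       # the "what"
print(json.dumps(asdict(metric)))    # the "how much"
print([s.service for s in trace])    # the "where": services the request touched
```

The log describes one event, the metric is a single number in a series, and the trace is a list of spans tied together by a shared ID.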

Why it matters

You cannot fix what you cannot see, and in production you only see what your systems emit. Each pillar answers a different question: logs the "what," metrics the "how much," traces the "where." Knowing which to reach for turns a vague outage into a targeted fix.

What to learn

Logs: structured, leveled, and searchable
Metrics: counters, gauges, histograms
Traces and spans across services
When to use each pillar
Cardinality and why high-cardinality labels hurt
Correlation: tying logs, metrics, and traces together
OpenTelemetry as a vendor-neutral standard
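The cardinality item in the list above comes down to simple arithmetic. In many metrics backends each distinct combination of label values is stored as its own time series, so the worst-case series count for one metric is the product of the distinct values per label. The sketch below assumes that model; the label names and counts are made up for illustration.

```python
from math import prod
from typing import Mapping


def series_count(distinct_values_per_label: Mapping[str, int]) -> int:
    """Worst-case number of time series for one metric:
    the product of the number of distinct values of each label."""
    return prod(distinct_values_per_label.values())


# Hypothetical label sets for a single request-count metric.
bounded = {"method": 5, "status": 6, "route": 20}
with_user_id = {**bounded, "user_id": 100_000}

print(series_count(bounded))       # 600
print(series_count(with_user_id))  # 60000000
```

Adding one unbounded label multiplies the series count by the number of values it can take, which is why identifiers such as user IDs usually belong in logs or trace attributes rather than in metric labels.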

Common pitfall

Logging everything at full verbosity in production. The noise buries the signal, the storage bill explodes, and high-cardinality fields such as user or request IDs can overwhelm the backend that indexes them. Log at sensible levels, use metrics for high-volume counts rather than a log line per event, and reserve verbose logging for when you are actively debugging.
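A minimal sketch of that advice, using Python's standard logging module and a collections.Counter as a stand-in for a real metrics counter. The service name, route, and handle_request function are hypothetical.

```python
import logging
from collections import Counter

# Production-like level: DEBUG records are dropped, INFO and above are kept.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("checkout")  # hypothetical service name

# Stand-in for a metrics counter keyed by (route, status).
request_counts = Counter()


def handle_request(route: str, status: int) -> None:
    # One cheap increment per event instead of one log line per event.
    request_counts[(route, status)] += 1
    # Verbose detail only appears if the logger is switched to DEBUG.
    log.debug("handled %s -> %s", route, status)
    if status >= 500:
        # Failures are comparatively rare and worth a log line each.
        log.error("request failed route=%s status=%s", route, status)


for _ in range(1000):
    handle_request("/cart", 200)
handle_request("/cart", 500)

print(request_counts[("/cart", 200)])  # 1000
print(request_counts[("/cart", 500)])  # 1
```

A thousand successful requests produce zero log lines and one counter value, while the single failure still leaves a record to read.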

Resources

Primary (free):

OpenTelemetry — Documentation · docs
Google SRE — Monitoring distributed systems · docs
Grafana — Observability fundamentals · docs

Practice

Instrument a small service to emit all three: structured logs, a request-count metric, and a trace across one downstream call. For a sample failure, decide which pillar you would consult first and why. Done when you can answer "what," "how much," and "where" from your own telemetry.
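One possible starting point for this exercise, kept to the standard library so it runs anywhere. It is a sketch under stated assumptions: the in-memory lists stand in for real backends, and handle_order, call_inventory, and the span helper are hypothetical names. A real service would more likely use an OpenTelemetry SDK, which this code does not attempt to imitate.

```python
import json
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logs = []                  # structured log records (in-memory stand-in)
spans = []                 # finished spans (in-memory stand-in)
request_count = Counter()  # metric: requests by HTTP status


def log(level, message, trace_id, **fields):
    """Emit one structured log record carrying the trace_id for correlation."""
    record = {"ts": time.time(), "level": level, "msg": message,
              "trace_id": trace_id, **fields}
    logs.append(record)
    print(json.dumps(record))


@contextmanager
def span(name, trace_id, parent_id=None):
    """Time a block of work and record it as a span, noting any error."""
    span_id = uuid.uuid4().hex[:16]
    start = time.perf_counter()
    error = None
    try:
        yield span_id
    except Exception as exc:
        error = repr(exc)
        raise
    finally:
        spans.append({"trace_id": trace_id, "span_id": span_id,
                      "parent_id": parent_id, "name": name, "error": error,
                      "duration_s": time.perf_counter() - start})


def call_inventory(trace_id, parent_id, fail):
    """Hypothetical downstream call, wrapped in a child span."""
    with span("inventory.lookup", trace_id, parent_id):
        if fail:
            raise TimeoutError("inventory timed out")
        return {"in_stock": True}


def handle_order(fail=False):
    trace_id = uuid.uuid4().hex
    status = 200
    with span("POST /orders", trace_id) as root_id:
        try:
            call_inventory(trace_id, root_id, fail)
        except TimeoutError as exc:
            status = 503
            log("ERROR", "downstream call failed", trace_id, error=str(exc))
    request_count[status] += 1
    return trace_id, status


ok_trace, _ = handle_order()
bad_trace, bad_status = handle_order(fail=True)

# "How much": how many requests failed?
print(dict(request_count))
# "What": the log records for the failing request.
print([r["msg"] for r in logs if r["trace_id"] == bad_trace])
# "Where": the span that recorded the error.
print([s["name"] for s in spans if s["trace_id"] == bad_trace and s["error"]])
```

The shared trace_id is what ties the three together: the metric shows that something failed, and the ID lets you jump from the error log to the exact span where it happened.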

Outcomes

Explain what logs, metrics, and traces each answer.
Choose the right pillar for a given question.
Avoid high-cardinality labels and over-verbose logging.
Correlate the three pillars to debug an incident.