Prometheus & Grafana · DevOps · Code with Animation

What are Prometheus and Grafana?

Prometheus collects metrics by scraping endpoints your services expose, stores them as time series, and lets you query them with PromQL. Grafana turns those queries into dashboards and alerts. Together they are the de facto open-source monitoring stack.
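The pull model can be illustrated with nothing but the Python standard library. The sketch below is an assumption-laden toy, not how Prometheus itself is implemented: the metric name `demo_requests_total`, the `REQUESTS_TOTAL` dict, and the helpers `render_metrics`, `parse_samples`, `scrape`, and `run_demo` are all hypothetical. A real service would normally use an official client library, and a real scrape is performed by the Prometheus server on a schedule.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter for the demo service.
REQUESTS_TOTAL = {"value": 0}


def render_metrics() -> str:
    """Render the counter in the Prometheus text exposition format."""
    return (
        "# HELP demo_requests_total Requests handled by the demo service.\n"
        "# TYPE demo_requests_total counter\n"
        f"demo_requests_total {REQUESTS_TOTAL['value']}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """The service side: it only answers when something asks for /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def parse_samples(text: str) -> dict:
    """Parse 'name value' sample lines, skipping comments (simplified: no timestamps)."""
    samples = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        name, value = line.rsplit(" ", 1)
        samples[name] = float(value)
    return samples


def scrape(url: str) -> dict:
    """The collector side: pull the endpoint over HTTP and parse what comes back."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_samples(resp.read().decode("utf-8"))


def run_demo() -> dict:
    """Start the toy service on a free local port, scrape it once, and stop it."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        REQUESTS_TOTAL["value"] = 3  # pretend the service handled three requests
        return scrape(f"http://127.0.0.1:{server.server_port}/metrics")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns the scraped samples as a dict keyed by metric name. The point of the design is that the service never pushes anything: it only keeps current values in memory and answers when the collector asks.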

Why it matters

Metrics are only useful if you can collect, query, and visualize them, and this pair is what many teams use to do it. Building a dashboard that surfaces real problems, and an alert that fires on them, is a daily DevOps task and a common interview topic.

What to learn

The pull model: Prometheus scrapes targets
Exposing metrics in the Prometheus format
Metric types: counter, gauge, histogram, summary
PromQL basics for querying
Recording rules and alerting rules
Building Grafana dashboards
Alertmanager and routing notifications
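To make the metric types in the list above concrete, here is a minimal sketch of a counter, a gauge, and a histogram in plain Python. The class names, the `render` method, and the metric name used in the usage note are assumptions made for illustration; they are not the API of any Prometheus client library. The summary type, which reports quantiles computed on the client, is left out to keep the sketch short.

```python
import math


class Counter:
    """A value that only goes up, for example total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("counters cannot decrease")
        self.value += amount


class Gauge:
    """A value that can go up or down, for example queue depth."""

    def __init__(self):
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value


class Histogram:
    """Counts observations into cumulative buckets and tracks their sum and count."""

    def __init__(self, bounds):
        # The final +Inf bucket always equals the total observation count.
        self.bounds = sorted(bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value: float) -> None:
        self.sum += value
        self.count += 1
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                # Cumulative: every bucket whose upper bound covers the value is bumped.
                self.bucket_counts[i] += 1

    def render(self, name: str) -> str:
        """Render the histogram as text exposition lines (no labels other than le)."""
        lines = [f"# TYPE {name} histogram"]
        for bound, n in zip(self.bounds, self.bucket_counts):
            le = "+Inf" if math.isinf(bound) else repr(bound)
            lines.append(f'{name}_bucket{{le="{le}"}} {n}')
        lines.append(f"{name}_sum {self.sum}")
        lines.append(f"{name}_count {self.count}")
        return "\n".join(lines) + "\n"
```

For example, a `Histogram([0.1, 0.5])` that observes 0.05, 0.3, and 2.0 ends up with bucket counts 1, 2, and 3. Keeping buckets cumulative is what allows quantiles to be estimated later at query time rather than in the service.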

Common pitfall

Building dashboards full of every metric you can graph, so the one that matters is lost in a wall of charts. A good dashboard answers "is the service healthy?" at a glance — a few key signals like error rate, latency, and saturation. Vanity metrics that nobody acts on are clutter.

Practice

Expose a metric from a service, scrape it with Prometheus, and write a PromQL query for its request rate. Build a Grafana dashboard with error rate and latency, and add an alert that fires when the error rate crosses a threshold. You are done when the alert fires after you deliberately send failing requests.
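Before writing the query and the alert, it can help to see the arithmetic they rely on. The sketch below is a simplified stand-in, not the actual PromQL `rate()` function (which also extrapolates to the edges of the time window) and not Prometheus's rule evaluator. The names `simple_rate`, `error_ratio`, and `should_fire` are hypothetical, and the sample values in the usage note are made up.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Per-second increase of a counter across the samples, tolerating resets."""
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the process restarted and the counter began again from zero.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


def error_ratio(errors: List[Sample], total: List[Sample]) -> float:
    """Fraction of requests that failed, computed from two counters."""
    total_rate = simple_rate(total)
    return simple_rate(errors) / total_rate if total_rate > 0 else 0.0


def should_fire(ratios: List[Sample], threshold: float, for_seconds: float) -> bool:
    """True if the ratio has stayed above the threshold for at least for_seconds.

    ratios holds (timestamp, ratio) pairs, one per evaluation, oldest first.
    """
    if not ratios:
        return False
    breach_start = None
    for ts, ratio in ratios:
        if ratio > threshold:
            if breach_start is None:
                breach_start = ts
        else:
            breach_start = None  # the condition cleared, so the wait starts over
    return breach_start is not None and ratios[-1][0] - breach_start >= for_seconds
```

With total-request samples `[(0, 100), (15, 130), (30, 160)]` and error samples `[(0, 0), (15, 3), (30, 6)]`, the request rate is 2 per second and the error ratio is about 0.1. Requiring the breach to persist for a while, rather than firing on a single bad evaluation, is what keeps one failed request from paging anyone.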

Outcomes

Explain Prometheus's pull-based scraping model.
Expose metrics from a service and query them with PromQL.
Build a focused dashboard of key signals.
Define an alert that fires on a real problem.