Drift detection · AI / ML · Code with Animation

What is drift?

Drift is when the conditions a model meets in production move away from the ones it was trained under. Data drift is a change in the distribution of the inputs; concept drift is a change in the relationship between inputs and the correct answer. Either can quietly erode a model's accuracy.
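The distinction can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the one-feature Gaussian inputs, the hypothetical `true_label` rule, the frozen `model`, and every number. It shows a fixed model losing accuracy in two different ways, once because the inputs move and once because the rule linking inputs to labels moves.

```python
import random

def true_label(x, lower=0.0, upper=2.0):
    """Hypothetical ground truth: positive only when lower < x < upper."""
    return 1 if lower < x < upper else 0

def model(x):
    """Frozen toy model: it learned 'x > 0' because training inputs almost never exceeded 2."""
    return 1 if x > 0.0 else 0

def accuracy(xs, label_fn):
    """Share of inputs where the frozen model agrees with the given labelling rule."""
    return sum(model(x) == label_fn(x) for x in xs) / len(xs)

rng = random.Random(0)
n = 5000

# Training-time conditions: inputs centred on 0, original rule.
train_inputs = [rng.gauss(0.0, 0.5) for _ in range(n)]
acc_train = accuracy(train_inputs, true_label)

# Data drift: the inputs move (centre 0 -> 2), the rule is unchanged.
drifted_inputs = [rng.gauss(2.0, 0.5) for _ in range(n)]
acc_data_drift = accuracy(drifted_inputs, true_label)

# Concept drift: the inputs look the same, but the rule's lower bound moves 0 -> 0.3.
same_inputs = [rng.gauss(0.0, 0.5) for _ in range(n)]
acc_concept_drift = accuracy(same_inputs, lambda x: true_label(x, lower=0.3))

print(f"training conditions: {acc_train:.3f}")
print(f"data drift:          {acc_data_drift:.3f}")
print(f"concept drift:       {acc_concept_drift:.3f}")
```

In this toy setup only the data-drift case is visible from the inputs alone; the concept-drift case needs fresh labels, or a proxy for them, before it shows up.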

Why it matters

Models are trained on a snapshot of the past, but the world keeps moving: user behavior shifts, markets change, new categories appear. Drift is a common reason a once-good model gets worse, and detecting it is what tells you when to retrain. Detection closes the loop in ML operations by feeding what happens in production back into training.

What to learn

Data drift versus concept drift
Statistical tests for distribution change
Monitoring input feature distributions
Detecting prediction drift
Setting thresholds that signal real change
Triggering retraining from drift signals
Tools like Evidently for drift reports
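
As one example of a statistical test for distribution change, the sketch below implements the two-sample Kolmogorov-Smirnov statistic with only the standard library and compares it against the usual large-sample critical value. The function names, sample sizes, and the 0.05 significance level are assumptions for illustration; in practice `scipy.stats.ks_2samp` provides the same test with a p-value. The same comparison can be run on a model's prediction scores to watch for prediction drift.

```python
import bisect
import math
import random

def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    ref_sorted, cur_sorted = sorted(reference), sorted(current)
    largest_gap = 0.0
    # The largest gap always occurs at one of the observed values.
    for value in ref_sorted + cur_sorted:
        cdf_ref = bisect.bisect_right(ref_sorted, value) / len(ref_sorted)
        cdf_cur = bisect.bisect_right(cur_sorted, value) / len(cur_sorted)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_cur))
    return largest_gap

def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation of the KS rejection threshold at level alpha."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))

def has_drifted(reference, current, alpha=0.05):
    """True when the KS statistic exceeds the critical value."""
    limit = ks_critical_value(len(reference), len(current), alpha)
    return ks_statistic(reference, current) > limit

rng = random.Random(42)
training_feature = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Simulated later batch whose mean has moved by 0.5 (an assumed shift).
shifted_batch = [rng.gauss(0.5, 1.0) for _ in range(2000)]

d = ks_statistic(training_feature, shifted_batch)
limit = ks_critical_value(len(training_feature), len(shifted_batch))
print(f"KS statistic {d:.3f} vs critical value {limit:.3f}: drift={d > limit}")
```

With large batches even tiny, harmless shifts become statistically significant, so a test like this is usually paired with a threshold on the size of the shift, not only on its significance.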

Common pitfall

Assuming a model stays as good as the day it launched and never planning for retraining. Drift is often gradual and silent, so by the time someone notices, the model may have been underperforming for a while. Set up drift detection and a retraining plan from the start, rather than treating decay as a surprise.

Resources

Primary (free):

Evidently AI — Data drift · docs
Google — Data distribution shifts · docs
Made With ML — Monitoring · course

Practice

Take a feature from your model's inputs and compare its distribution between the training data and a simulated "later" batch that has shifted. Use a statistical test or a drift report to flag the change, and define the threshold that would trigger retraining. Done when drift is detected automatically, not by eye.
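
One way the exercise might be approached is sketched below, using the Population Stability Index (PSI) as the drift score. This is an illustrative assumption rather than a prescribed solution: the function names, the ten quantile bins, the simulated data, and the 0.25 cutoff (a common rule of thumb, not a universal constant) are all choices to tune for a real feature.

```python
import bisect
import math
import random

def psi(reference, current, bins=10, floor=1e-6):
    """Population Stability Index of `current` relative to `reference`."""
    ref_sorted = sorted(reference)
    # Interior bin edges sit at the reference quantiles, so each bin
    # holds roughly the same share of the reference data.
    edges = [ref_sorted[len(ref_sorted) * i // bins] for i in range(1, bins)]

    def bin_shares(values):
        counts = [0] * bins
        for value in values:
            counts[bisect.bisect_right(edges, value)] += 1
        # Floor the shares so an empty bin cannot cause log(0) or division by zero.
        return [max(count / len(values), floor) for count in counts]

    ref_shares, cur_shares = bin_shares(reference), bin_shares(current)
    return sum((cur - ref) * math.log(cur / ref)
               for ref, cur in zip(ref_shares, cur_shares))

def should_retrain(reference, current, threshold=0.25):
    """Automated check: flag retraining when PSI reaches the chosen threshold."""
    return psi(reference, current) >= threshold

rng = random.Random(7)
training_feature = [rng.gauss(50.0, 10.0) for _ in range(5000)]
stable_batch = [rng.gauss(50.0, 10.0) for _ in range(5000)]   # same distribution
shifted_batch = [rng.gauss(60.0, 10.0) for _ in range(5000)]  # simulated "later" batch, mean +10

for name, batch in [("stable", stable_batch), ("shifted", shifted_batch)]:
    score = psi(training_feature, batch)
    print(f"{name}: PSI={score:.3f} retrain={should_retrain(training_feature, batch)}")
```

Because `should_retrain` returns a boolean, it can be run on a schedule against each new batch, which is what makes the detection automatic instead of a matter of eyeballing histograms.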

Outcomes

Distinguish data drift from concept drift.
Monitor input and prediction distributions for change.
Set thresholds that signal meaningful drift.
Trigger retraining from drift detection.