MLOps · Advanced · 5h

Drift detection.

Noticing when the world changes and the model decays.

What is drift?

Drift is when the conditions a model meets in production move away from those it was trained on. Data drift is a change in the distribution of the inputs; concept drift is a change in the relationship between inputs and the correct answer, which can happen even when the inputs look the same. Either can quietly erode a model's accuracy.
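
To make the distinction concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the single input `x`, the "true" labelling rule expressed as a band `(low, high)`, and the stand-in `model` that learned "positive x means 1". It is a toy, not a recipe for real monitoring.

```python
import random

def make_batch(n, x_mean, label_band, rng):
    """Draw n inputs x ~ Normal(x_mean, 1); the true label is 1 inside label_band."""
    low, high = label_band
    xs = [rng.gauss(x_mean, 1.0) for _ in range(n)]
    ys = [int(low < x < high) for x in xs]
    return xs, ys

def model(x):
    """Hypothetical trained model: it learned the rule 'positive x means 1'."""
    return int(x > 0.0)

def accuracy(xs, ys):
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)

def mean(xs):
    return sum(xs) / len(xs)

rng = random.Random(0)
batches = {
    # Baseline: inputs centred on 0, label is 1 for 0 < x < 2.5.
    "training": make_batch(5000, 0.0, (0.0, 2.5), rng),
    # Data drift: the inputs move, the labelling rule stays the same.
    "data drift": make_batch(5000, 2.0, (0.0, 2.5), rng),
    # Concept drift: the inputs stay put, the labelling rule moves.
    "concept drift": make_batch(5000, 0.0, (1.0, 3.5), rng),
}

summary = {name: (mean(xs), accuracy(xs, ys)) for name, (xs, ys) in batches.items()}
for name, (input_mean, acc) in summary.items():
    print(f"{name:14s} input mean={input_mean:+.2f}  accuracy={acc:.2f}")
```

In this toy setup the data-drift batch shows up as a shifted input mean, while the concept-drift batch keeps the training input distribution and still loses accuracy. That is why monitoring inputs alone can miss concept drift.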

Why it matters

Models are trained on a snapshot of the past, but the world keeps moving: user behavior shifts, markets change, new categories appear. Drift is a leading reason a once-good model gets worse, and detecting it is what tells you when to retrain. Drift detection closes the loop between deployment and retraining, which makes it a core part of responsible ML operations.

What to learn

  • Data drift versus concept drift
  • Statistical tests for distribution change
  • Monitoring input feature distributions
  • Detecting prediction drift
  • Setting thresholds that signal real change
  • Triggering retraining from drift signals
  • Tools like Evidently for drift reports
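
Several of the topics above (monitoring feature distributions, prediction drift, thresholds, retraining triggers) can be sketched in one place. The example below uses the Population Stability Index (PSI), one common choice among many; it is not the method of any particular tool. The column names, the synthetic data, and the 0.25 cutoff are assumptions: 0.25 is a widely quoted rule of thumb, not a standard, and should be tuned per feature.

```python
import bisect
import math
import random

def quantile_edges(reference, n_bins=10):
    """Interior bin edges at evenly spaced quantiles of the reference sample."""
    ordered = sorted(reference)
    return [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]

def bin_fractions(values, edges, eps=1e-6):
    """Fraction of values per bin, floored at eps so empty bins avoid log(0)."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect.bisect_right(edges, v)] += 1
    return [max(c / len(values), eps) for c in counts]

def psi(reference, current, n_bins=10):
    """PSI = sum over bins of (current - reference) * ln(current / reference)."""
    edges = quantile_edges(reference, n_bins)
    ref = bin_fractions(reference, edges)
    cur = bin_fractions(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

def drifted_columns(scores, threshold=0.25):
    """Names whose PSI exceeds the (assumed) retraining threshold."""
    return [name for name, score in scores.items() if score > threshold]

rng = random.Random(1)

def sample(mean, sd, n=5000):
    return [rng.gauss(mean, sd) for _ in range(n)]

# Hypothetical columns: two input features plus the model's output score.
reference = {"age": sample(40, 10), "basket_value": sample(60, 15),
             "prediction_score": sample(0.3, 0.1)}
current = {"age": sample(40, 10), "basket_value": sample(75, 15),
           "prediction_score": sample(0.4, 0.1)}

scores = {name: psi(reference[name], current[name]) for name in reference}
drifted = drifted_columns(scores)
for name, score in scores.items():
    print(f"{name:17s} PSI={score:.3f}")
if drifted:
    print("retraining trigger:", drifted)
```

Treating the prediction score as just another monitored column is a deliberate design choice here: prediction drift is then checked with the same code path and threshold logic as the input features.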

Common pitfall

Assuming a model stays as good as the day it launched and never planning for retraining. Drift is often gradual and raises no errors on its own, so by the time someone notices, the model may have been underperforming for a while. Set up drift detection and a retraining plan from the start, rather than treating decay as a surprise.

Resources

Primary (free):

Practice

Take a feature from your model's inputs and compare its distribution between the training data and a simulated "later" batch that has shifted. Use a statistical test or a drift report to flag the change, and define the threshold that would trigger retraining. Done when drift is detected automatically, not by eye.
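
One possible starting point for this exercise, using a hand-written two-sample Kolmogorov-Smirnov statistic so it runs with the standard library alone. In practice a library routine such as SciPy's `scipy.stats.ks_2samp` would normally replace it. The feature values are synthetic, and `alpha` and `min_effect` are assumed settings to tune, not recommended values.

```python
import math
import random

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both CDFs past x so tied values are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_critical_value(n, m, alpha):
    """Large-sample approximation: c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)
    return c_alpha * math.sqrt((n + m) / (n * m))

def drift_detected(reference, current, alpha=0.01, min_effect=0.1):
    """Flag drift only when the gap is both significant and large enough to matter."""
    d = ks_statistic(reference, current)
    significant = d > ks_critical_value(len(reference), len(current), alpha)
    return significant and d > min_effect

rng = random.Random(42)
training = [rng.gauss(50, 10) for _ in range(2000)]
same_later = [rng.gauss(50, 10) for _ in range(2000)]     # no shift
shifted_later = [rng.gauss(58, 10) for _ in range(2000)]  # simulated shift

for name, batch in [("same", same_later), ("shifted", shifted_later)]:
    print(f"{name:8s} D={ks_statistic(training, batch):.3f} "
          f"retrain={drift_detected(training, batch)}")
```

The `min_effect` floor is there because, with large batches, a significance test alone flags differences too small to justify retraining. Requiring a minimum gap is one simple way to make the threshold signal real change.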

Outcomes

  • Distinguish data drift from concept drift.
  • Monitor input and prediction distributions for change.
  • Set thresholds that signal meaningful drift.
  • Trigger retraining from drift detection.