Phase 5

Evaluation

Test whether the result is good enough for the decision it supports.

What this phase actually is

Evaluation asks whether the model is good enough for the decision, not whether it looks impressive in isolation.

That requires both technical metrics and business judgment. False positives and false negatives usually carry different costs, so the “best” threshold depends on what action follows the prediction.
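A minimal sketch of how the threshold can follow from error costs rather than from a single metric. The scores, labels, candidate thresholds, and cost values are all invented for illustration and do not come from the source; the helper names `error_cost` and `best_threshold` are hypothetical.

```python
def error_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total cost of errors when every score >= threshold is flagged."""
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return false_pos * cost_fp + false_neg * cost_fn


def best_threshold(scores, labels, cost_fp, cost_fn, candidates):
    """Return the candidate threshold with the lowest total error cost."""
    return min(
        candidates,
        key=lambda t: error_cost(scores, labels, t, cost_fp, cost_fn),
    )


# Toy data (assumption): model scores and the true outcome for six customers.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]
candidates = [0.2, 0.35, 0.5, 0.65, 0.8]

# Missing a positive is expensive: a low threshold wins.
cheap_action = best_threshold(scores, labels, cost_fp=1, cost_fn=10, candidates=candidates)

# Acting on a false alarm is expensive: a high threshold wins.
costly_action = best_threshold(scores, labels, cost_fp=10, cost_fn=1, candidates=candidates)

print(cheap_action, costly_action)
```

The same model and the same data give two different "best" thresholds once the cost of each error type changes, which is why the follow-up action has to be known before a threshold is fixed.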

The useful output is a decision recommendation: use the model, change it, collect more data, or return to the business question.

How this looks at Bertelsmann

Try it

Threshold Explorer

Subscribers saved: 33

Intervention cost: €1224

Estimated net value: €756

True positive: 117
False positive: 189
False negative: 103
True negative: 591
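The explorer figures can be checked with a few lines of arithmetic. The sketch below takes the four confusion-matrix counts and the number of subscribers saved from the snapshot above. The two unit values (€4 per intervention, €60 per saved subscriber) are not stated in the source; they are back-calculated assumptions that happen to reproduce the displayed cost and net value.

```python
# Counts as shown in the Threshold Explorer snapshot.
tp, fp, fn, tn = 117, 189, 103, 591
subscribers_saved = 33

# Assumptions, inferred from the displayed totals and not given by the source.
COST_PER_INTERVENTION_EUR = 4
VALUE_PER_SAVED_SUBSCRIBER_EUR = 60

# Everyone the model flags receives the intervention.
contacted = tp + fp

precision = tp / contacted   # share of contacted subscribers who were true positives
recall = tp / (tp + fn)      # share of actual positives the model flagged

intervention_cost = contacted * COST_PER_INTERVENTION_EUR
net_value = subscribers_saved * VALUE_PER_SAVED_SUBSCRIBER_EUR - intervention_cost

print(f"precision={precision:.3f} recall={recall:.3f}")
print(f"intervention cost=€{intervention_cost} net value=€{net_value}")
```

Under these assumptions, the intervention cost comes to €1224 and the net value to €756, matching the snapshot. Moving the threshold changes `tp` and `fp` together, so cost and saved subscribers shift at the same time.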

Pitfalls

  • Reporting accuracy when the business cost of errors is asymmetric.
  • Evaluating on examples that are easier than the real use case.
  • Letting a good chart hide a weak operational decision.
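The first pitfall can be illustrated with the Threshold Explorer counts. The sketch compares the model at the shown threshold with a hypothetical "flag nobody" baseline. The baseline's net value of €0 is an assumption: no interventions means no cost and no saved subscribers.

```python
# Counts as shown in the Threshold Explorer snapshot.
tp, fp, fn, tn = 117, 189, 103, 591
total = tp + fp + fn + tn

# Model at the displayed threshold.
model_accuracy = (tp + tn) / total
model_net_value_eur = 756  # as displayed by the explorer

# Hypothetical baseline that never flags anyone: every actual negative
# is classified correctly, every actual positive is missed.
baseline_accuracy = (tn + fp) / total
baseline_net_value_eur = 0  # assumption: no action, no cost, no benefit

print(f"model:    accuracy={model_accuracy:.3f} net value=€{model_net_value_eur}")
print(f"baseline: accuracy={baseline_accuracy:.3f} net value=€{baseline_net_value_eur}")
```

The baseline scores higher on accuracy while delivering no value, so accuracy alone would rank the two options in the wrong order for this decision.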