Phase 5
Evaluation
Test whether the result is good enough for the decision it supports.
What this phase actually is
Evaluation asks whether the model is good enough for the decision, not whether it looks impressive in isolation.
That requires technical metrics and business judgment. False positives and false negatives usually have different costs, and the “best” threshold depends on what action follows the prediction.
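One way to make the threshold choice concrete is to assign a cost to each error type and pick the threshold with the lowest total cost on held-out data. The sketch below is illustrative only: the function names, the toy scores and labels, and the cost values are assumptions, not part of the course material.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int
    fp: int
    fn: int
    tn: int


def confusion_counts(scores, labels, threshold):
    """Count outcomes when every score >= threshold is treated as positive."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    return Counts(tp, fp, fn, tn)


def total_cost(counts, cost_fp, cost_fn):
    """Total error cost; correct predictions are assumed to cost nothing."""
    return counts.fp * cost_fp + counts.fn * cost_fn


def best_threshold(scores, labels, cost_fp, cost_fn, candidates):
    """Return the candidate threshold with the lowest total error cost."""
    return min(
        candidates,
        key=lambda t: total_cost(confusion_counts(scores, labels, t), cost_fp, cost_fn),
    )


# Toy data (hypothetical): model scores and true labels (1 = positive).
scores = [0.1, 0.3, 0.45, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]
candidates = [0.2, 0.5, 0.7]

# Missing a positive is expensive: a low threshold wins.
print(best_threshold(scores, labels, cost_fp=1, cost_fn=10, candidates=candidates))
# A false alarm is expensive: a high threshold wins.
print(best_threshold(scores, labels, cost_fp=10, cost_fn=1, candidates=candidates))
```

The same scores and labels produce different preferred thresholds once the costs are swapped, which is why the threshold cannot be chosen from the model output alone.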
The useful output is a decision recommendation: use it, change it, collect more data, or return to the business question.
How this looks at Bertelsmann
Try it
Threshold Explorer
Subscribers saved: 33
Intervention cost: €1224
Estimated net value: €756
True positive: 117
False positive: 189
False negative: 103
True negative: 591
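The Threshold Explorer figures above can be tied together with a few lines of arithmetic. The page does not state the unit economics, so the cost per intervention (€4) and the value per saved subscriber (€60) below are assumptions, back-calculated so that the totals match the displayed values. The sketch also assumes that every predicted positive receives the intervention.

```python
# Confusion-matrix counts as displayed in the Threshold Explorer.
tp, fp, fn, tn = 117, 189, 103, 591
saved = 33  # subscribers saved, as displayed

# ASSUMPTION: everyone flagged by the model (TP + FP) receives the intervention.
interventions = tp + fp

# Share of flagged subscribers who were real churners.
precision = tp / (tp + fp)
# Share of real churners the model flagged.
recall = tp / (tp + fn)

# ASSUMPTIONS (not stated in the source): unit values inferred from the totals.
COST_PER_INTERVENTION = 4.0   # 1224 / 306
VALUE_PER_SAVED = 60.0        # (756 + 1224) / 33

intervention_cost = interventions * COST_PER_INTERVENTION
net_value = saved * VALUE_PER_SAVED - intervention_cost

print(f"interventions={interventions}, precision={precision:.3f}, recall={recall:.3f}")
print(f"intervention cost={intervention_cost:.0f}, net value={net_value:.0f}")
```

Under these assumptions, roughly 38% of interventions reach an actual churner and about 53% of churners are reached. Whether that trade-off is acceptable is the business judgment this phase asks for.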
Pitfalls
- Reporting accuracy alone when false positives and false negatives carry different business costs.
- Evaluating on examples that are easier than the real use case.
- Letting a good chart hide a weak operational decision.
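The first pitfall can be checked against the Threshold Explorer counts. The sketch below compares the model's accuracy at the displayed threshold with a baseline that never flags anyone. The baseline is a hypothetical comparison added for illustration, not something the page reports.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Counts as displayed in the Threshold Explorer.
tp, fp, fn, tn = 117, 189, 103, 591

model_accuracy = accuracy(tp, fp, fn, tn)

# Hypothetical baseline: predict "no churn" for everyone.
# All real churners become false negatives, all others true negatives.
baseline_accuracy = accuracy(0, 0, tp + fn, fp + tn)

print(f"model accuracy:    {model_accuracy:.3f}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
```

On these counts the do-nothing baseline scores higher on accuracy than the model, yet it triggers no interventions and therefore saves no subscribers. Accuracy alone would point to the wrong decision.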