
>> Great points regarding aligning precision/recall of AI systems with actual human supervision capabilities.

May I nitpick? The article discusses False Positive Rate and False Negative Rate. These are, respectively, the complements of True Negative Rate and True Positive Rate a.k.a. Sensitivity, a.k.a. Recall.

Precision is not the complement of the False Positive Rate; Specificity is, as in Sensitivity/Specificity. Precision (Positive Predictive Value) is a different quantity altogether: TP/(TP+FP).

These metrics tend to be reported in pairs: TPR/TNR, FPR/FNR, Precision/Recall, Sensitivity/Specificity, which confuses the issue, but not all of these pairs are equivalent.
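A minimal sketch of the distinction, using hypothetical confusion-matrix counts (the numbers are made up for illustration). Note that TPR/FNR and TNR/FPR are complements, while Precision is computed along the other axis of the matrix:

```python
# Hypothetical counts for a binary classifier's confusion matrix.
tp, fn = 80, 20   # signal present: hits and misses
fp, tn = 10, 90   # signal absent: false alarms and correct rejections

tpr = tp / (tp + fn)        # True Positive Rate = Sensitivity = Recall
tnr = tn / (tn + fp)        # True Negative Rate = Specificity
fpr = fp / (fp + tn)        # False Positive Rate = 1 - Specificity
fnr = fn / (fn + tp)        # False Negative Rate = 1 - Sensitivity
precision = tp / (tp + fp)  # Positive Predictive Value: conditions on the
                            # *prediction*, not on the true class

print(f"TPR={tpr:.2f} TNR={tnr:.2f} FPR={fpr:.2f} FNR={fnr:.2f} "
      f"precision={precision:.3f}")
```

With these counts Recall is 0.80 while Precision is about 0.889, and unlike FPR/FNR, Precision changes with class prevalence even when the classifier itself does not.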



To add to that, since I see it nowhere mentioned explicitly, neither in this thread nor in the article: the theoretical framework here is Signal Detection Theory (SDT) [1]. The problem at hand can be nicely visualized with two overlapping bell curves for the signal present/absent situations [2].

[1] https://en.wikipedia.org/wiki/Detection_theory

[2] https://jdlee888.shinyapps.io/SDT_Demo/
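The two-overlapping-bell-curves picture can be written down directly. A minimal sketch of equal-variance Gaussian SDT, with an assumed separation d' and decision criterion (both values are illustrative, not taken from the article):

```python
import math

def phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Signal-absent scores ~ N(0, 1); signal-present scores ~ N(d_prime, 1).
d_prime = 1.5     # separation between the two bell curves (assumed)
criterion = 0.75  # respond "signal present" for scores above this (assumed)

hit_rate = 1 - phi(criterion - d_prime)  # P(score > c | signal present)
false_alarm_rate = 1 - phi(criterion)    # P(score > c | signal absent)

print(f"hit rate = {hit_rate:.3f}, false alarm rate = {false_alarm_rate:.3f}")
```

Sliding the criterion trades hits against false alarms along the ROC curve, which is exactly the FPR/FNR trade-off the article is describing.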


Two points. First, these frameworks don't really account for driving into a concrete bollard vs. a shaky ride. Second, none of these frameworks estimates production performance well; time after time after time I see classifiers that did "95%" in test and "88%" in prod, and "that's pretty good!".

My point is that we are really bad at estimating performance of classifiers. Really bad, and mostly we are pretending it's fine.
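One common mechanism behind that gap is distribution shift: the threshold is tuned on test-time score distributions that production data no longer follows. A toy simulation under assumed Gaussian score distributions (the means and the "95% / 88%" flavor are illustrative, not real measurements):

```python
import random

random.seed(0)

def draw(n, neg_mu, pos_mu, sigma=1.0):
    """Balanced sample of (score, label) pairs with Gaussian scores."""
    neg = [(random.gauss(neg_mu, sigma), False) for _ in range(n // 2)]
    pos = [(random.gauss(pos_mu, sigma), True) for _ in range(n // 2)]
    return neg + pos

def accuracy(threshold, samples):
    return sum((x > threshold) == label for x, label in samples) / len(samples)

threshold = 1.65  # tuned as the midpoint of the test-time class means

test = draw(10_000, 0.0, 3.3)  # the distribution we evaluated on
prod = draw(10_000, 0.6, 3.1)  # class means drift "in production"

print(f"test accuracy: {accuracy(threshold, test):.3f}")
print(f"prod accuracy: {accuracy(threshold, prod):.3f}")
```

The classifier itself never changed; only the data did, which is why a single held-out test number is a poor estimate of what the classifier will do in production.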



