Now, everything is connected, but this is not primarily about persistent research misconceptions such as statistical significance.
Instead it is about (inherently) interpretable ML versus (misleading with some nonzero frequency) explanatory ML that I previously blogged on just over a year ago.
That was when I first became aware of work by Cynthia Rudin (Duke) that argues upgraded versions of easy to interpret machine learning (ML) technologies (e.g. CART, constrained optimisation to get sparse rule lists, trees, linear integer models, etc.) can offer similar predictive performance to new(er) ML (e.g. deep neural nets) with the added benefit of inherent interpretability.
In that initial post, I overlooked the need to define (inherently) interpretable ML as ML where the connection between the inputs given and the prediction made is direct. That is, it is simply clear how the ML predicts but not necessarily why such predictions would make sense – understanding how the model works but not an explanation of how the world works.
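To make "the connection between the inputs given and prediction made is direct" a bit more concrete, here is a minimal sketch of a linear integer (scoring) model in Python. Everything in it is an assumption made up for illustration: the feature names, the thresholds, the point values and the cut-off are hypothetical and are not taken from Rudin's work or any fitted model. In practice the points would come out of a constrained optimisation over data rather than being typed in by hand.

```python
from typing import Callable, Dict, List, Tuple

Applicant = Dict[str, float]

# Hypothetical scorecard: every condition and point value below is invented
# for illustration and is not taken from any published or fitted model.
SCORECARD: List[Tuple[str, Callable[[Applicant], bool], int]] = [
    ("missed_payments >= 2", lambda a: a["missed_payments"] >= 2, 3),
    ("credit_utilisation > 0.8", lambda a: a["credit_utilisation"] > 0.8, 2),
    ("years_of_history < 3", lambda a: a["years_of_history"] < 3, 1),
]
THRESHOLD = 3  # hypothetical cut-off: predict "high risk" at or above this


def score(applicant: Applicant) -> Tuple[int, List[Tuple[str, int]]]:
    """Return the total points and the conditions that contributed them."""
    fired = [(name, pts) for name, cond, pts in SCORECARD if cond(applicant)]
    total = sum(pts for _, pts in fired)
    return total, fired


def predict(applicant: Applicant) -> str:
    """The whole model: add up small integers and compare with a cut-off."""
    total, _ = score(applicant)
    return "high risk" if total >= THRESHOLD else "low risk"


if __name__ == "__main__":
    example = {"missed_payments": 0, "credit_utilisation": 0.9, "years_of_history": 1}
    total, fired = score(example)
    print(predict(example), total, fired)
```

Anyone can read off how a prediction was reached by listing which conditions fired. What the sketch does not provide is any account of why those conditions should matter in the world, which is the distinction drawn above.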
What’s new? Not much and that’s troubling.
For instance, policy makers are still widely accepting black box models without significant attempts at getting interpretable (rather than explainable) models that would be even better. Apparently, the current lack of interpretable models with comparable performance to black box models in some high profile applications is being taken, without question, as the usual situation. Is that a way to dismiss consideration of interpretable models? Or maybe it is just wishful thinking?
Now there have been improvements in both interpretable methods and their exposition.
For instance, an interpretable ML model achieved comparable accuracy to black box ML and received the FICO Recognition Award. The award acknowledged the interpretable submission for going above and beyond expectations with a fully transparent global model that did not need explanation. Additionally, there was a user-friendly dashboard to allow users to explore the global model and its interpretations. So a nice, very visible success.
Additionally, theoretical work has proceeded to discern if accurate interpretable models could possibly exist in many if not most applications. It avoids Occam’s-Razor-style arguments about the world being truly simple by using a technical argument about function classes, and in particular, Rashomon Sets.
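As a rough sketch of what a Rashomon set is (the notation here is my own assumption, not taken from the post): given a function class $\mathcal{F}$, an empirical loss $\hat{L}$, its minimiser $\hat{f}$ over $\mathcal{F}$, and a tolerance $\theta \ge 0$, the Rashomon set collects every model that predicts almost as well as the best one.

```latex
\hat{R}_{\text{set}}(\mathcal{F}, \theta)
  = \left\{ f \in \mathcal{F} \;:\; \hat{L}(f) \le \hat{L}(\hat{f}) + \theta \right\},
\qquad
\hat{f} \in \arg\min_{f \in \mathcal{F}} \hat{L}(f).
```

The argument, loosely, is that when this set is large it plausibly contains at least one model from a simpler, interpretable class, without having to assume that the world itself is simple.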
As for their exposition, there is now a succinct 10 minute YouTube video, Please Stop Doing “Explainable” ML, that hits many of the key points, along with a highly readable technical exposition that further fleshes out these points: Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.
However, as pointed out in the paper, the problem persists that “Black box machine learning models are currently being used for high stakes decision-making throughout society, causing problems throughout healthcare, criminal justice, and in other domains. People have hoped that creating methods for explaining these black box models will alleviate some of these problems, but trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practices and can potentially cause catastrophic harm to society”.
This might be best brought home by googling “interpretable ML” – most of what you get is about creating methods for explaining black box models that might mitigate some of the problems, often using a synonym for explaining: interpreting. Is this a semantic ploy to keep most on the same low road of trying to salvage black box methods in applications where interpretable models have already been developed or likely can be?
Recall I started this post with “everything is connected” – the connection I see here is with persistent research misconceptions.
“Why do such important misconceptions about research persist? To a large extent these misconceptions represent substitutes for more thoughtful and difficult tasks. … These misconceptions involve taking the low road [to achieving and maintaining academic prestige in methodology] , but when that road is crowded with others taking the same path, there may be little reason to question the route.” Six Persistent Research Misconceptions. Kenneth J. Rothman
Same old same old: these misconceptions [better to explain black boxes than switch] represent substitutes for more thoughtful and difficult tasks [learning methods to obtain interpretable ML and getting them]. The more widely shared the misconception (the more crowded the low road), the less the need to be thoughtful. The misconception will likely persist for a long time 🙁
Now, trying to stay with the inertia, especially if it is substantial, has its advantages. It might even be optimal in the short run. It’s the accountant/economist’s dilemma of disregarding sunk costs.
A forthright quote about this is available from a well established AI researcher, Been Kim: “There is a branch of interpretability research that focuses on building inherently interpretable models … Right now you have AI models everywhere that are already built, and are already being used for important purposes, without having considered interpretability from the beginning. … You could say, “Interpretability is so useful, let me build you another model to replace the one you already have.” Well, good luck with that [convincing someone to not just fix something that is not broken, but actually throw it out and replace it?!!!]”.
For me, the bottom line of this post is that in settings where there are yet a lot of black box models being used – avoid those sunk costs if at all possible – they really su*k!