The paper introduces a new open-source package called InterpretML. It also presents a new learning algorithm called the Explainable Boosting Machine (EBM).
The goal of the package is to unify explainability techniques under a common API. It contains both glassbox models (inherently interpretable ML models) and explainability methods that can be applied to blackbox models. In addition to this selection of algorithms, it provides built-in visualisation tools for easy comparison and assessment. The package also contains the first implementation of the Explainable Boosting Machine (EBM) algorithm.
The authors highlight four key design principles behind the InterpretML package:
Ease of Comparison: Since users want to compare algorithms and see what fits their use case best, the package is designed to make this comparison seamless through a uniform API.
Staying true to the source: The algorithms integrated into the package should be accurate and as close as possible to their original implementations. This is not always easy when trying to fit them all under one common API.
Play nice with others: The library is built on top of tools that are widely adopted by the community, reusing what already exists instead of recreating it. A good example is that visualisations render inside the Jupyter notebook itself.
Take what you want: The end user should be able to extract and use what is essential to their use case without the overhead of everything else. For instance, on a production server one might not need the visualisations and only use the raw values.
The package's capabilities fall broadly into two categories:
Glassbox Algorithms: Algorithms that inherently produce interpretable and explainable models. The best examples are linear models and low-depth decision trees. The new addition here is the Explainable Boosting Machine (EBM), a strong yet inherently interpretable model whose experimental performance is found to be on par with models like XGBoost.
In the literature, glassbox algorithms are also called whitebox algorithms.
Blackbox interpretability algorithms: These are not models themselves; instead, they can explain any machine learning or deep learning model. In other words, they are model-agnostic explanation algorithms. The package includes implementations of LIME, SHAP, partial dependence plots, etc. under the common API.
The package can return both raw outputs and visualisations; the user chooses which kind of output they want. This split is reflected in InterpretML's design diagram.
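To make this concrete, here is a minimal sketch of what the common API looks like for a glassbox model; the dataset and the exact calls are assumptions based on the package's documented interface, not code from the paper:

```python
# Minimal sketch of InterpretML's unified API (glassbox side), assuming the
# `interpret` package and scikit-learn's breast cancer dataset are available.
from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)

# Global explanation: per-feature shape functions (and pairwise interactions).
ebm_global = ebm.explain_global()
show(ebm_global)            # interactive visualisation inside the notebook

# "Take what you want": the raw numbers behind the plots, no visualisation.
raw_scores = ebm_global.data(0)
```

Blackbox explainers such as LIME and SHAP are exposed through the same explain/show pattern, which is what makes side-by-side comparison easy.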
The EBM algorithm is a fast implementation of the GA²M algorithm, which in turn is an extension of the GAM algorithm. Therefore, let's start with what a GAM is.
GAM stands for Generalized Additive Model. It is more flexible than logistic regression, but still interpretable. The hypothesis function for a GAM is as follows:
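In standard notation from the GAM literature (this exact notation is an assumption; g is the link function, e.g. the logit for classification):

$$ g(\mathbb{E}[y]) = \beta_0 + \sum_{i} f_i(x_i) $$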
The key thing to notice is that instead of a linear term 𝛽ixi for each feature, we now have a function fi(xi). We will come back later to how this function is computed in EBM.
One limitation of GAM is that each feature function is learned independently. This prevents the model from capturing interactions between features and pushes the accuracy down.
GA²M seeks to improve on this by also considering some pairwise interaction terms in addition to the function learnt for each feature. This is not an easy problem because the number of candidate interaction pairs is large, which increases compute time drastically. GA²M uses the FAST algorithm to pick out useful interactions efficiently. This is the hypothesis function for GA²M; note the extra pairwise interaction terms:
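In the same assumed notation, with f_ij denoting a learned function over a pair of features:

$$ g(\mathbb{E}[y]) = \beta_0 + \sum_{i} f_i(x_i) + \sum_{(i,j)} f_{ij}(x_i, x_j) $$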
By adding pairwise interaction terms we get a stronger model that is still interpretable, because a pairwise term can be rendered as a heatmap of the two features in 2D, clearly showing their joint effect on the output.
Finally, let us talk about the EBM algorithm itself. In EBM, each feature function fi(xi) is learned using methods such as bagging and gradient boosting. To make the learning independent of the order of the features, the authors use a very low learning rate and cycle through the features in round-robin fashion. The feature function fi represents how much feature i contributes to the model's prediction, and is hence directly interpretable. One can plot the individual function for each feature to visualise how it affects the prediction, and the pairwise interaction terms can be visualised on a heatmap as described earlier.
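As a rough illustration of this training loop, here is a simplified sketch for squared-error regression with assumed hyperparameters, no bagging and no pairwise terms; it is not the package's actual implementation:

```python
# Simplified EBM-style training: one additive shape function per feature,
# grown by tiny boosted trees visited in round-robin order with a small
# learning rate so the feature order barely matters.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_ebm_like(X, y, n_rounds=500, learning_rate=0.01, max_leaves=3):
    n_samples, n_features = X.shape
    shape_functions = [[] for _ in range(n_features)]  # trees per feature
    prediction = np.zeros(n_samples)
    for _ in range(n_rounds):
        for i in range(n_features):            # cycle through features
            residual = y - prediction          # fit what is left to explain
            tree = DecisionTreeRegressor(max_leaf_nodes=max_leaves)
            tree.fit(X[:, [i]], residual)      # each tree sees only feature i
            prediction += learning_rate * tree.predict(X[:, [i]])
            shape_functions[i].append(tree)
    return shape_functions

def shape_value(shape_functions, i, xi, learning_rate=0.01):
    # f_i(x_i): feature i's additive contribution, directly plottable.
    xi = np.asarray(xi, dtype=float).reshape(-1, 1)
    return learning_rate * sum(t.predict(xi) for t in shape_functions[i])
```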
This implementation of EBM is also parallelizable, which is invaluable in large-scale systems. It has the added advantage of extremely fast inference.
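To see why inference is fast: once training is done, each learned shape function can be discretised into a per-feature lookup table, so a prediction is one table lookup per feature plus an addition. A toy illustration with made-up bin edges and scores:

```python
# Toy EBM-style inference: per-feature lookup tables plus a sum.
import numpy as np

bin_edges  = [np.array([0.0, 1.0, 2.0]),         # feature 0: 4 bins
              np.array([10.0, 20.0])]             # feature 1: 3 bins
bin_scores = [np.array([-0.3, 0.1, 0.4, 0.9]),    # f_0 per bin
              np.array([ 0.2, -0.1, 0.5])]        # f_1 per bin
intercept = -0.05

def predict_score(x):
    # Sum the per-feature contributions looked up from the tables.
    return intercept + sum(
        scores[np.searchsorted(edges, x[i])]
        for i, (edges, scores) in enumerate(zip(bin_edges, bin_scores))
    )

print(predict_score([1.5, 25.0]))   # -0.05 + 0.4 + 0.5
```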
- InterpretML: A Unified Framework for Machine Learning Interpretability by Harsha Nori, Samuel Jenkins, Paul Koch, Rich Caruana
- InterpretML
- Accurate Intelligible Models with Pairwise Interactions by Yin Lou, Rich Caruana, Johannes Gehrke, Giles Hooker