The goal is to explain the predictions made by a Machine Learning model. By explain, we mean: how did the features/predictors affect the final outcome for a test point? The authors propose a framework for this task and call it LIME - Local Interpretable Model-agnostic Explanations.
LIME is an explainability tool. It is model agnostic, meaning it does not depend on the type of trained model; in other words, it works for any ML/DL model. It is also a post-hoc explainability method: it explains predictions after the model has been trained, unlike approaches that build interpretability into training itself (e.g., a Linear Model). Finally, it generates local explanations, i.e., explanations for a particular test point, as opposed to a global method that seeks to explain the model's behaviour over many points (such as the training data).
LIME takes three inputs:
1. The trained model
2. The test point
3. The training data*
*The original paper does not call for the training data (3). However, the implementation of LIME for tabular data does require it, and it is that implemented methodology which is described in this section.
LIME takes in the test point (2) whose prediction from the trained model (1) has to be explained, and proceeds roughly as follows (a simplified sketch is given after the list):
1. Generate a locality around the test point. Treating each feature independently, sample noise from a unit Gaussian, then scale it back to the feature's original range using that feature's mean and standard deviation computed from the training data (3), centring the perturbed samples on the test point.
2. Weigh the sampled points by their distance from the test point, giving greater weight to the closer ones. This yields the features (X) of the locality.
3. Obtain the labels (y) for the locality by passing the perturbed samples through the trained model's prediction function.
4. Fit a Weighted Linear model to this (X, y). The coefficients of the linear model are returned as the explanation, since linear models are inherently interpretable.
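The snippet below is a minimal sketch of this locality-building and surrogate-fitting procedure in plain NumPy/scikit-learn. It is not the library's actual code; the function name, the exponential distance kernel, and the ridge surrogate are simplifying assumptions chosen to mirror the steps above.

```python
# Simplified LIME-for-tabular-data sketch (assumptions: a fitted classifier
# `model` exposing predict_proba, a 1-D numpy `test_point`, and `X_train`
# used only for per-feature means and standard deviations).
import numpy as np
from sklearn.linear_model import Ridge

def lime_tabular_sketch(model, test_point, X_train, num_samples=5000, kernel_width=None):
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0) + 1e-12

    # 1. Perturb: unit-Gaussian noise per feature, rescaled by the training
    #    std and centred on the test point.
    noise = np.random.normal(0.0, 1.0, size=(num_samples, test_point.shape[0]))
    X_local = noise * std + test_point
    X_local[0] = test_point  # keep the original point in the locality

    # 2. Weigh samples by distance to the test point (closer => larger weight).
    if kernel_width is None:
        kernel_width = 0.75 * np.sqrt(test_point.shape[0])
    distances = np.linalg.norm((X_local - test_point) / std, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # 3. Label the locality with the black-box model's own predictions
    #    (probability of the class predicted for the test point).
    probs = model.predict_proba(X_local)
    target_class = probs[0].argmax()
    y_local = probs[:, target_class]

    # 4. Fit a weighted linear (ridge) surrogate; its coefficients are the explanation.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(X_local, y_local, sample_weight=weights)
    return surrogate.coef_
```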
One of the first model explainability methods, hence widely used.
A very well maintained and feature rich Python library.
Explanations are easy to understand.
Model Agnostic.
Works for multi-class tasks, and for text and vision models too.
The method of generating the locality assumes feature independence and can also generate out-of-distribution data.
A linear model cannot capture non-linear localities, so the explanations may not be right in such cases.
Install the LIME package first using:
pip install lime
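Below is a minimal usage sketch based on the library's documented tabular API. The names model, X_train, X_test, feature_names, and class_names are placeholders for your own trained classifier and data.

```python
# Assumptions: `model` is a fitted classifier with predict_proba,
# `X_train`/`X_test` are numpy arrays, and `feature_names`/`class_names`
# are lists of strings describing the data.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=np.array(X_train),
    feature_names=feature_names,
    class_names=class_names,
    mode="classification",
)

# Explain a single test point; LIME perturbs it internally, queries
# model.predict_proba on the locality, and fits the weighted linear model.
explanation = explainer.explain_instance(
    data_row=X_test[0],
    predict_fn=model.predict_proba,
    num_features=5,
)
print(explanation.as_list())  # (feature, weight) pairs from the linear surrogate
```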
by Marco Tulio Ribeiro, Sameer Singh and Carlos Guestrin (2016).