
Interpretability is the degree to which a human can understand the cause of a decision.
Scenarios
- Why does our trained model approve or reject certain credit applications?
- Why does our trained model classify someone as having Alzheimer's disease from a brain CT scan?
Local vs Global Interpretability
- Local: The explanation covers an individual prediction.
- Global: The explanation covers the behaviour of the entire model.
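
The difference is easiest to see on a linear model. The sketch below is only an illustration: the model, its weights, and the example house are made-up assumptions, not taken from any real dataset. The global explanation is the set of weights, which applies to every input; the local explanation is each feature's contribution to one specific prediction.

```python
# Hypothetical linear house-price model: price = BIAS + sum(weight * feature).
# All weights and feature values are made-up numbers for illustration only.
WEIGHTS = {"size_m2": 3000.0, "rooms": 10000.0, "age_years": -500.0}
BIAS = 50000.0


def predict(instance):
    """Return the predicted price for one instance (a dict of feature values)."""
    return BIAS + sum(WEIGHTS[name] * value for name, value in instance.items())


def global_explanation():
    """Global view: the weights describe how the model treats every input."""
    return dict(WEIGHTS)


def local_explanation(instance):
    """Local view: each feature's contribution to this one prediction."""
    return {name: WEIGHTS[name] * value for name, value in instance.items()}


house = {"size_m2": 100.0, "rooms": 4.0, "age_years": 20.0}
contributions = local_explanation(house)

print("Global (weights):", global_explanation())
print("Local (contributions for this house):", contributions)
print("Prediction:", predict(house))
```

For a linear model the local contributions plus the bias add up exactly to the prediction; for more complex models this decomposition is not directly available and has to be approximated.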
Scope of Interpretability
- Algorithm Transparency: How much do we understand about the algorithm?
We know that a Convolutional Neural Network (CNN) learns simple features, such as edges and lines, in its lower layers and more complex features in its deeper layers. This is general knowledge about how CNNs work, independent of whether a particular model has been trained.
- Global Interpretability (Overall): How does the trained model make predictions?
This level of interpretability focuses on explaining the overall behaviour of a trained model. It is usually out of reach because complex models such as neural networks can have millions of features and learnable weights. Even a simpler linear model with more than 5 features is difficult for humans to visualise and comprehend.
- Global Interpretability (Modular): How do parts of the model affect predictions?
There is a better chance of understanding parts of a model, such as a particular weight, than the entire model. However, explaining a single weight without taking the other weights into account is often ambiguous. E.g., a positive weight on the number of rooms means the house value rises with the room count only if other predictors such as size and age are held constant, which is rarely the case in real applications.
- Local Interpretability (Single Prediction): Why did the model make a certain prediction for an instance?
Since global interpretability is difficult, we narrow the explanation down to what the model predicts for a single instance/example. E.g., to explain a prediction, we look at one particular instance, a house of 100 square metres. We then simulate how the predicted price changes as the house size is gradually increased or decreased in steps of 10 square metres. Local explanations are often more accurate than global explanations.
- Local Interpretability (Group of Predictions): Why did the model make specific predictions for a group of instances?
Model predictions can also be explained for a group of instances/examples. First apply a global method (the modular view above) to the group (a subset of examples), then apply the local method for single predictions to the individual instances. Finally, aggregate the results over the entire group to form the explanation.
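
The house-size example for a single prediction, and the aggregation over a group, can be sketched as follows. This is a minimal sketch under stated assumptions: `predict_price` is a hypothetical stand-in for a trained model with a made-up formula, and averaging is only one possible way to aggregate local effects over a group.

```python
def predict_price(size_m2, rooms=4, age_years=20):
    """Hypothetical stand-in for a trained model (made-up formula)."""
    return (50000.0 + 3000.0 * size_m2 - 5.0 * size_m2 ** 2
            + 10000.0 * rooms - 500.0 * age_years)


def size_what_if(size_m2, step=10, n_steps=3):
    """Vary the size around one instance, keeping the other features fixed.

    Returns (new_size, change_in_predicted_price) pairs.
    """
    baseline = predict_price(size_m2)
    results = []
    for k in range(-n_steps, n_steps + 1):
        new_size = size_m2 + k * step
        results.append((new_size, predict_price(new_size) - baseline))
    return results


def group_what_if(sizes, delta=10):
    """Average the local effect of adding `delta` square metres over a group."""
    effects = [predict_price(s + delta) - predict_price(s) for s in sizes]
    return sum(effects) / len(effects)


# Single instance: a house of 100 square metres.
for new_size, change in size_what_if(100):
    print(f"size={new_size} m2, price change={change:+.0f}")

# Group of instances: houses of 80, 100 and 120 square metres.
print("Average effect of +10 m2:", group_what_if([80, 100, 120]))
```

Because this stand-in model is non-linear in size, the effect of an extra 10 square metres differs from house to house, which is why the explanation for one instance need not match the group average.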
