Author
Dhaval Mandalia
Dhaval Mandalia is a co-founder of Arocom with 20+ years of experience. He works on Artificial Intelligence projects across various domains and enjoys data science, machine learning, data engineering, management and training. He writes blogs about data and management strategies, creates vlogs on various health initiatives, and is a contributing member of several AI communities.
Explainable Artificial Intelligence (XAI) and its applications
Artificial intelligence has come a long way in the last decade. From simple image recognition to Sophia, AI is growing by leaps and bounds. AI has been adopted by various organizations for different goals; the figure below shows some of these goals by percentage.
AI adoption has begun in critical areas like the healthcare, legal and financial industries. These industries are highly complex and have a major impact on people's lives, and they face a critical challenge when using AI: understanding how the AI makes a decision.
For a long time, decision making by AI has remained a black box. Many data scientists have been saying that "it is more of an art than a science". However, for a highly regulated industry like healthcare, transparency in AI decision making is imperative. With the advent of regulatory mechanisms like GDPR, this will apply to other industries as well.
Thus, to find out how these models make decisions, and to make sure that this decision-making process is aligned with the ethical, legal and procedural requirements of the organization, one needs to improve the interpretability of these models.
While interpretability or explainability is usually mentioned in the context of models, it actually applies to the entire system: the features, the logic, the model parameters and the model itself. There are five main qualities of an explainable AI:
- Transparency: The ability of the AI to let the user understand what drives its predictions, even if the model is unknown or opaque. The key factors driving a decision should be visible or derivable.
- Consistency: Explanations should be consistent across different executions of the same model.
- Generalizability: Explanations should be general, not devised separately for each model or model run.
- Trust: Both the model and its explanations should match human performance, even in the mistakes they make.
- Fidelity: The explanation should represent what the model actually did based on evaluation; it should not be a justification.
Below is a chart of complexity versus explainability for the different models used in AI:
Learning algorithms can be split into three families:
- Rule based, e.g. decision trees
- Factor based, e.g. logistic regression
- Case based, e.g. k-nearest neighbours (KNNs)
Take a factor-based model such as linear regression:
y = w_0 + w_1x_1 + w_2x_2 + … + w_nx_n
Here each coefficient describes the change in the response caused by a one-unit increase in the corresponding independent variable: once w_1 is known, we can read off the effective increase in y for a one-unit increase in x_1.
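A minimal sketch of this kind of interpretation, using scikit-learn's LinearRegression on its built-in diabetes dataset (an arbitrary choice for illustration):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

data = load_diabetes()
model = LinearRegression().fit(data.data, data.target)

# Each coefficient w_i is the expected change in y for a one-unit increase
# in x_i, holding the other features constant
for name, w in zip(data.feature_names, model.coef_):
    print(f"{name}: {w:+.2f}")
print(f"intercept (w_0): {model.intercept_:.2f}")
```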
Similarly, a naïve Bayes algorithm assumes that features are independent of each other and contribute independently to the output, so it is easy to measure the individual contribution of each feature and derive explanations.
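As an illustrative sketch of this, the snippet below fits a GaussianNB on the iris dataset (an arbitrary choice) and prints each feature's log-likelihood contribution to the predicted class; the theta_ and var_ attribute names assume a recent scikit-learn release:

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

data = load_iris()
model = GaussianNB().fit(data.data, data.target)

x = data.data[0]             # a single data point to explain
c = model.predict([x])[0]    # its predicted class

# Because features are assumed independent, the class log-probability is a
# sum of per-feature Gaussian log-likelihoods (theta_ = per-class feature
# means, var_ = per-class feature variances in recent scikit-learn versions)
contrib = norm.logpdf(x, loc=model.theta_[c], scale=np.sqrt(model.var_[c]))
for name, value in zip(data.feature_names, contrib):
    print(f"{name}: {value:.3f}")
```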
Decision trees are essentially large chains of "if-else" rules, and a random forest is a majority vote over a stack of decision trees.
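To make the "if-else" view concrete, scikit-learn can print a fitted tree as plain rules; a small sketch on the iris dataset (chosen only for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# export_text renders the fitted tree as a readable chain of if-else rules
print(export_text(tree, feature_names=data.feature_names))
```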
It is relatively simple to derive explanations for all of the models above. Ensembles and deep learning models are where explainability becomes a real challenge.
The approach to explaining a black-box model is to extract information from a trained model to understand its predictions without knowing how the model works internally. There is no hard requirement on how these explanations are presented, but one should be able to answer the question, "Can I trust the model?"
There are two different types of interpretation: global and local.
Global interpretation means being able to explain the conditional interaction between the dependent (response) variables and the independent (predictor) variables over the complete dataset.
Local interpretation means being able to explain that same interaction with respect to a single prediction.
Below are some of the methods for local interpretation:
Prediction Decomposition:
Robnik-Šikonja and Kononenko proposed this method in 2008. It explains prediction outputs by measuring the difference between the original prediction and the one made after omitting a set of features.
Let's assume a classification model represented as f: X (input) → Y (output). For the given dataset of inputs, x ∈ X is a data point with values for its attributes M_i, i = 1, 2, …, m, and it is labelled with class y ∈ Y.
The prediction difference is calculated by comparing the model's predicted probabilities with and without knowledge of M_i.
Challenges:
- If the target model does not output a probability score, the score needs to be recomputed.
- Since we compute a prediction without M_i, the model must be able to handle missing (NULL/NaN) values.
- The output must be a prediction in the form of a probability.
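The sketch below illustrates the idea rather than the authors' exact formulation: the "model without M_i" case is approximated by replacing feature i with values sampled from the data and averaging the predicted probabilities; the random forest and breast-cancer dataset are arbitrary stand-ins.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

def prediction_decomposition(model, X_background, x, class_idx, n_samples=50, seed=0):
    """For each feature i, the difference between p(class | x) and an estimate
    of p(class | x without feature i), obtained by replacing feature i with
    values sampled from the background data and averaging the predictions."""
    rng = np.random.default_rng(seed)
    p_full = model.predict_proba(x.reshape(1, -1))[0, class_idx]
    diffs = []
    for i in range(x.shape[0]):
        x_perturbed = np.tile(x, (n_samples, 1))
        x_perturbed[:, i] = rng.choice(X_background[:, i], size=n_samples)
        p_without = model.predict_proba(x_perturbed)[:, class_idx].mean()
        diffs.append(p_full - p_without)
    return np.array(diffs)

x = data.data[0]
c = model.predict(x.reshape(1, -1))[0]
diffs = prediction_decomposition(model, data.data, x, class_idx=c)
for i in np.argsort(-np.abs(diffs))[:5]:
    print(f"{data.feature_names[i]}: {diffs[i]:+.4f}")
```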
Local Gradient Explanation Vector:
[Ref. Baehrens et al., 2010] This method explains the local decision taken by an arbitrary non-linear classification algorithm using local gradients, which characterize how a data point would have to move to change its predicted label.
Let's assume a classifier trained on dataset X that outputs a probability over the class labels Y. The local explanation vector is the derivative of the probability prediction function at a single point x. A large entry in this vector indicates a feature with a large influence on the model's decision.
Challenges:
- Again, the output must be in the form of a probability.
- If calibration is applied to the model output, it will not be visible in the explanation vectors and may cause the explanation to deviate from the model's actual behavior.
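A minimal sketch of the idea, approximating the local gradient of the class probability with finite differences; the logistic regression pipeline and breast-cancer dataset stand in for an arbitrary black-box classifier.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(data.data, data.target)

def local_gradient(predict_proba, x, class_idx, eps=1e-4):
    """Finite-difference estimate of the gradient of the class probability at
    point x; large entries mark features with strong local influence."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        x_plus, x_minus = x.copy(), x.copy()
        x_plus[i] += eps
        x_minus[i] -= eps
        p_plus = predict_proba(x_plus.reshape(1, -1))[0, class_idx]
        p_minus = predict_proba(x_minus.reshape(1, -1))[0, class_idx]
        grad[i] = (p_plus - p_minus) / (2 * eps)
    return grad

x = data.data[0].astype(float)
c = model.predict(x.reshape(1, -1))[0]
grad = local_gradient(model.predict_proba, x, class_idx=c)
for i in np.argsort(-np.abs(grad))[:5]:
    print(f"{data.feature_names[i]}: {grad[i]:+.4f}")
```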
LIME (Local Interpretable Model-agnostic Explanations) Framework:
LIME approximates the black-box model locally, in the neighborhood of the prediction being explained, and works on interpretable representations of the data. For example:
- Image classifier: create a binary vector indicating the presence or absence of a contiguous patch of similar pixels.
- Text classifier: create a binary vector indicating the presence or absence of a word.
The concept behind LIME is that it is easier to approximate a black-box model with a simple model locally.
By examining whether the explanations make sense, one can decide whether the model is trustworthy.
For example, LIME can be used to identify the words that have the highest impact on deciding whether a Quora question is "sincere".
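A minimal sketch with the lime package, using a tiny made-up set of "sincere"/"insincere" questions in place of the real Quora data and a TF-IDF plus logistic regression pipeline as the black box:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Tiny made-up training set standing in for the Quora "sincere question" data
texts = [
    "How do I start learning machine learning from scratch?",
    "What are good resources for studying statistics?",
    "Why are people from that group so stupid?",
    "Do you agree that everyone who disagrees with me is an idiot?",
] * 10
labels = [1, 1, 0, 0] * 10   # 1 = sincere, 0 = insincere

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["insincere", "sincere"])
explanation = explainer.explain_instance(
    "Why are people who study statistics so stupid?",
    model.predict_proba,      # the black-box probability function
    num_features=6,
)
print(explanation.as_list())  # (word, weight) pairs with the highest local impact
```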
SHAP (SHapley Additive exPlanations) Framework:
In the SHAP framework, every feature used in the model is given a relative importance score called its SHAP value. This score indicates how much that particular feature contributed to the model's decision.
In the example above, Age and Education-Num are the top two features.
A feature's importance is measured by the change in the model's error when the feature is perturbed: if perturbing the feature increases the model error, the feature is important; if the error is unchanged, the feature is unimportant.
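A minimal sketch with the shap package; it uses shap's bundled adult census data (the dataset behind the Age / Education-Num example, downloaded on first use) and a gradient-boosted tree model as a stand-in. Exact return shapes of shap_values can vary between shap versions.

```python
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Adult census data bundled with shap (features include Age, Education-Num);
# subsampled here only to keep the example fast
X, y = shap.datasets.adult()
X, y = X[:2000], y[:2000]

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes SHAP values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# The summary plot ranks features by their mean absolute SHAP value
shap.summary_plot(shap_values, X)
```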
Interpreting a model locally is supposed to be easier than interpreting it globally, but local explanations are harder to maintain (think of the curse of dimensionality). The methods described below aim to explain the behavior of a model as a whole. However, a global approach cannot capture fine-grained interpretations, such as a feature being important in one region of the input space but not at all in another.