A generic explainability architecture for explaining text machine learning models
text_explainability provides a generic architecture from which well-known state-of-the-art explainability approaches for text can be composed. This modular architecture allows components to be swapped out and combined, so that new types of explainability approaches for (natural language) text can be developed quickly, and a plethora of existing approaches can be improved by improving a single module.
Several example methods are included, which provide either local explanations (explaining the prediction of a single instance, e.g. LIME and SHAP) or global explanations (explaining the dataset, or model behavior on the dataset, e.g. TokenFrequency and MMDCritic). By replacing the default modules (e.g. local data generation, global data sampling, or improved embedding methods), these methods can be improved upon or new methods can be introduced.
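To illustrate the modular idea, here is a hedged, self-contained sketch (plain Python, not the text_explainability API; the function and component names are hypothetical): a perturbation-based local explainer where the neighborhood sampler is a swappable component, so replacing `delete_words_sampler` with a different sampler changes the method without touching the scoring logic.

```python
from typing import Callable, Dict, List

def delete_words_sampler(tokens: List[str]) -> List[List[str]]:
    # Leave-one-out neighborhood: one perturbed text per token,
    # produced by dropping that token (index i pairs with token i)
    return [tokens[:i] + tokens[i + 1:] for i in range(len(tokens))]

def local_importance(text: str,
                     predict: Callable[[str], float],
                     sampler: Callable[[List[str]], List[List[str]]]) -> Dict[str, float]:
    # Score each token by how much the prediction drops when it is removed;
    # assumes the sampler returns exactly one perturbation per token
    tokens = text.split()
    base = predict(text)
    return {tok: base - predict(' '.join(perturbed))
            for tok, perturbed in zip(tokens, sampler(tokens))}

# Toy model standing in for a real classifier: fraction of "positive" words
POSITIVE = {'good', 'great'}
predict = lambda s: sum(w in POSITIVE for w in s.split()) / max(len(s.split()), 1)

print(local_importance('a great movie', predict, delete_words_sampler))
```

Swapping in another sampler (e.g. one that replaces tokens instead of deleting them) yields a different local explanation method while the rest of the pipeline stays unchanged.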
© Marcel Robeer, 2021
Quick tour
Local explanation: explain a model's prediction on a given sample, either self-provided or taken from a dataset.
from text_explainability import LIME, LocalTree
# Define sample to explain; `model` is assumed to be a trained text classifier
sample = 'Explain why this is positive and not negative!'
# LIME explanation (local feature importance)
LIME().explain(sample, model).scores
# List of local rules, extracted from tree
LocalTree().explain(sample, model).rules
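The snippet above assumes a trained `model`. As a minimal sketch of such a model (using scikit-learn here is an assumption; consult the API reference for the exact model wrapper text_explainability expects), one could train:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny toy corpus; in practice, train on a real labeled dataset
train_texts = ['a very positive example', 'a sadly negative example']
train_labels = ['positive', 'negative']

# TF-IDF features feeding a logistic regression classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

print(model.predict(['this is positive']))
```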
Global explanation: explain the whole dataset (e.g. train set, test set), and how it relates to the ground-truth or predicted labels.
from text_explainability import import_data, TokenFrequency, MMDCritic
# Import dataset
env = import_data('./datasets/test.csv', data_cols=['fulltext'], label_cols=['label'])
# Top-k most frequent tokens per label
TokenFrequency(env.dataset).explain(labelprovider=env.labels, explain_model=False, k=3)
# 2 prototypes and 1 criticism for the dataset
MMDCritic(env.dataset)(n_prototypes=2, n_criticisms=1)
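Conceptually, `TokenFrequency` per label boils down to counting tokens within each label group and reporting the top-k. A hedged, plain-Python sketch of that idea (not the library's implementation; the helper name is hypothetical):

```python
from collections import Counter
from typing import Dict, List, Tuple

def top_k_tokens(texts: List[str], labels: List[str],
                 k: int = 3) -> Dict[str, List[Tuple[str, int]]]:
    # Count whitespace-split tokens per label, then take the k most frequent
    counts: Dict[str, Counter] = {}
    for text, label in zip(texts, labels):
        counts.setdefault(label, Counter()).update(text.split())
    return {label: c.most_common(k) for label, c in counts.items()}

texts = ['good great film', 'good plot', 'bad film', 'bad bad acting']
labels = ['pos', 'pos', 'neg', 'neg']
print(top_k_tokens(texts, labels))
```

The library version additionally handles tokenization, ground-truth vs. predicted labels (`explain_model`), and its own data structures, but the counting principle is the same.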
Using text_explainability
- Installation
Installation guide, covering installation via pip or directly from the Git repository.
- Example Usage
An extended usage example.
- Explanation Methods Included
Overview of the explanation methods included in text_explainability.
- text_explainability API reference
A reference to all classes and functions included in text_explainability.
Development
- text_explainability @ GIT
The Git repository contains the open-source code and the most recent development version.
- Changelog
Changes for each version are recorded in the changelog.
- Contributing
Contributors to the open-source project and contribution guidelines.
Extensions
text_explainability
can be extended to also perform sensitivity testing: checking machine learning models for robustness and fairness. The text_sensitivity package is available on PyPI and fully documented at https://text-sensitivity.readthedocs.io/.
Citation
@misc{text_explainability,
title = {Python package text\_explainability},
author = {Marcel Robeer},
howpublished = {\url{https://git.science.uu.nl/m.j.robeer/text_explainability}},
year = {2021}
}