interpret
Fit interpretable models. Explain blackbox machine learning.
Stars: 6369
README:
InterpretML is an open-source package that incorporates state-of-the-art machine learning interpretability techniques under one roof. With this package, you can train interpretable glassbox models and explain blackbox systems. InterpretML helps you understand your model's global behavior, or understand the reasons behind individual predictions.
Interpretability is essential for:
- Model debugging - Why did my model make this mistake?
- Feature Engineering - How can I improve my model?
- Detecting fairness issues - Does my model discriminate?
- Human-AI cooperation - How can I understand and trust the model's decisions?
- Regulatory compliance - Does my model satisfy legal requirements?
- High-risk applications - Healthcare, finance, judicial, ...
Python 3.7+ | Linux, Mac, Windows
pip install interpret
# OR
conda install -c conda-forge interpret

The Explainable Boosting Machine (EBM) is an interpretable model developed at Microsoft Research. It uses modern machine learning techniques like bagging, gradient boosting, and automatic interaction detection to breathe new life into traditional GAMs (Generalized Additive Models). This makes EBMs as accurate as state-of-the-art techniques like random forests and gradient boosted trees. However, unlike these blackbox models, EBMs produce exact explanations and are editable by domain experts.
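For orientation, the model family behind this description can be written compactly. The form below is the standard generalized additive model with pairwise interactions (GA2M) from the Lou et al. papers cited later in this README; the symbols $g$, $\beta_0$, $f_j$, and $f_{ij}$ are notation chosen here for illustration, not names used by the package.

```latex
g\big(\mathbb{E}[y]\big) \;=\; \beta_0 \;+\; \sum_{j} f_j(x_j) \;+\; \sum_{i < j} f_{ij}(x_i, x_j)
```

Here $g$ is the link function (identity for regression, logit for classification), each $f_j$ depends on a single feature, and each $f_{ij}$ on a single pair of features. Because the prediction is a plain sum of per-term contributions, every term can be plotted and inspected on its own.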
| Dataset/AUROC | Domain | Logistic Regression | Random Forest | XGBoost | Explainable Boosting Machine |
|---|---|---|---|---|---|
| Adult Income | Finance | .907±.003 | .903±.002 | .927±.001 | .928±.002 |
| Heart Disease | Medical | .895±.030 | .890±.008 | .851±.018 | .898±.013 |
| Breast Cancer | Medical | .995±.005 | .992±.009 | .992±.010 | .995±.006 |
| Telecom Churn | Business | .849±.005 | .824±.004 | .828±.010 | .852±.006 |
| Credit Fraud | Security | .979±.002 | .950±.007 | .981±.003 | .981±.003 |
Notebook for reproducing table
| Interpretability Technique | Type |
|---|---|
| Explainable Boosting | glassbox model |
| APLR | glassbox model |
| Decision Tree | glassbox model |
| Decision Rule List | glassbox model |
| Linear/Logistic Regression | glassbox model |
| SHAP Kernel Explainer | blackbox explainer |
| LIME | blackbox explainer |
| Morris Sensitivity Analysis | blackbox explainer |
| Partial Dependence | blackbox explainer |
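Of the blackbox explainers in the table, partial dependence is the easiest to sketch by hand. The snippet below is a minimal standard-library illustration of the idea (average the model's output while one feature is forced to each value on a grid); it is not the implementation used by `interpret`, and `partial_dependence`, `blackbox`, and the toy rows are hypothetical names and data made up for this example.

```python
from statistics import mean


def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        # Copy every row, overwrite the feature of interest, and score the copy.
        modified = [
            row[:feature_index] + [value] + row[feature_index + 1:]
            for row in rows
        ]
        curve.append(mean(predict(row) for row in modified))
    return curve


# Hypothetical blackbox: any callable that maps a feature row to a number.
def blackbox(row):
    age, income = row
    return 0.02 * age + 0.5 * (income > 50)


# Toy data: [age, income] per row (made up for illustration).
rows = [[25, 30], [40, 60], [60, 80]]
pd_age = partial_dependence(blackbox, rows, feature_index=0, grid=[20, 40, 60])
print(pd_age)
```

The resulting curve only requires the ability to call the model, which is why the technique applies to any blackbox system.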
Let's fit an Explainable Boosting Machine
from interpret.glassbox import ExplainableBoostingClassifier
ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)
# or substitute with LogisticRegression, DecisionTreeClassifier, RuleListClassifier, ...
# EBM supports pandas dataframes, numpy arrays, and handles "string" data natively.

Understand the model
from interpret import show
ebm_global = ebm.explain_global()
show(ebm_global)

Understand individual predictions
ebm_local = ebm.explain_local(X_test, y_test)
show(ebm_local)

And if you have multiple model explanations, compare them
show([logistic_regression_global, decision_tree_global])

If you need to keep your data private, use Differentially Private EBMs (see DP-EBMs)
from interpret.privacy import DPExplainableBoostingClassifier, DPExplainableBoostingRegressor
dp_ebm = DPExplainableBoostingClassifier(epsilon=1, delta=1e-5) # Specify privacy parameters
dp_ebm.fit(X_train, y_train)
show(dp_ebm.explain_global()) # Identical function calls to standard EBMs

For more information, see the documentation.
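The `epsilon` and `delta` arguments passed to `DPExplainableBoostingClassifier` above form the usual (ε, δ) differential-privacy budget. As rough intuition for how such a budget translates into noise, the sketch below uses the textbook Gaussian mechanism on a clipped sum. This is an assumption-laden illustration: the calibration shown is the classic bound that holds for 0 < ε < 1, it is not necessarily the privacy accounting DP-EBMs use internally, and `gaussian_sigma` and `private_sum` are hypothetical helper names.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Classic Gaussian-mechanism noise scale (textbook bound, needs 0 < epsilon < 1)."""
    if not 0 < epsilon < 1:
        raise ValueError("this bound assumes 0 < epsilon < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_sum(values, clip, epsilon, delta, rng):
    """Release a noisy sum of values after clipping each one to [-clip, clip]."""
    # Clipping bounds how much adding or removing one record can change the sum.
    clipped = [max(-clip, min(clip, v)) for v in values]
    sigma = gaussian_sigma(clip, epsilon, delta)
    return sum(clipped) + rng.gauss(0.0, sigma)


rng = random.Random(0)  # fixed seed so the illustration is repeatable
sigma = gaussian_sigma(sensitivity=1.0, epsilon=0.5, delta=1e-5)
noisy = private_sum([0.2, -3.0, 0.9], clip=1.0, epsilon=0.5, delta=1e-5, rng=rng)
print(f"sigma={sigma:.3f}, noisy sum={noisy:.3f}")
```

Smaller `epsilon` or `delta` values give a larger noise scale, which is the accuracy cost of a tighter privacy guarantee.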
EBMs include pairwise interactions by default. For 3-way interactions and higher see this notebook: https://interpret.ml/docs/python/examples/custom-interactions.html
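To make the idea of an exact explanation with a pairwise term concrete, here is a minimal sketch of how an additive model scores one sample: each term contributes a looked-up value, and the prediction is the intercept plus the sum of those contributions. The lookup tables, bin edges, and scores are invented for illustration and do not reflect how `interpret` stores a fitted EBM.

```python
import math

# Hypothetical learned terms, stored as simple lookup tables.
INTERCEPT = -1.0
AGE_BINS = [(30, -0.8), (50, 0.1), (float("inf"), 0.9)]  # (upper bound, score)
SMOKER_SCORES = {"no": -0.3, "yes": 0.7}
# Pairwise term keyed by (age bin index, smoker value); missing keys contribute 0.
AGE_X_SMOKER = {(2, "yes"): 0.4}


def age_bin(age):
    """Return the index of the first bin whose upper bound exceeds age."""
    for index, (upper, _score) in enumerate(AGE_BINS):
        if age < upper:
            return index
    return len(AGE_BINS) - 1


def explain(age, smoker):
    """Return per-term contributions, the logit, and the predicted probability."""
    index = age_bin(age)
    contributions = {
        "age": AGE_BINS[index][1],
        "smoker": SMOKER_SCORES[smoker],
        "age & smoker": AGE_X_SMOKER.get((index, smoker), 0.0),
    }
    logit = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    return contributions, logit, probability


contributions, logit, probability = explain(age=62, smoker="yes")
for term, value in contributions.items():
    print(f"{term:>12}: {value:+.2f}")
print(f"logit={logit:.2f}, probability={probability:.3f}")
```

Since the contributions add up to the logit exactly, the explanation is the model itself rather than an approximation of it. A domain expert could also edit a single table entry and know precisely which predictions change.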
Interpret EBMs can be fit on datasets with 100 million samples in several hours. For larger workloads consider using distributed EBMs on Azure SynapseML: classification EBMs and regression EBMs
InterpretML was originally created by (equal contributions): Samuel Jenkins, Harsha Nori, Paul Koch, and Rich Caruana
EBMs are a fast derivative of GA2M, invented by: Yin Lou, Rich Caruana, Johannes Gehrke, and Giles Hooker
Many people have supported us along the way. Check out ACKNOWLEDGEMENTS.md!
We also build on top of many great packages. Please check them out!
plotly | dash | scikit-learn | lime | shap | salib | skope-rules | treeinterpreter | gevent | joblib | pytest | jupyter
InterpretML
"InterpretML: A Unified Framework for Machine Learning Interpretability" (H. Nori, S. Jenkins, P. Koch, and R. Caruana 2019)
@article{nori2019interpretml,
title={InterpretML: A Unified Framework for Machine Learning Interpretability},
author={Nori, Harsha and Jenkins, Samuel and Koch, Paul and Caruana, Rich},
journal={arXiv preprint arXiv:1909.09223},
year={2019}
}
Paper link
Explainable Boosting
"Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission" (R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad 2015)
@inproceedings{caruana2015intelligible,
title={Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission},
author={Caruana, Rich and Lou, Yin and Gehrke, Johannes and Koch, Paul and Sturm, Marc and Elhadad, Noemie},
booktitle={Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
pages={1721--1730},
year={2015},
organization={ACM}
}
Paper link
"Accurate intelligible models with pairwise interactions" (Y. Lou, R. Caruana, J. Gehrke, and G. Hooker 2013)
@inproceedings{lou2013accurate,
title={Accurate intelligible models with pairwise interactions},
author={Lou, Yin and Caruana, Rich and Gehrke, Johannes and Hooker, Giles},
booktitle={Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining},
pages={623--631},
year={2013},
organization={ACM}
}
Paper link
"Intelligible models for classification and regression" (Y. Lou, R. Caruana, and J. Gehrke 2012)
@inproceedings{lou2012intelligible,
title={Intelligible models for classification and regression},
author={Lou, Yin and Caruana, Rich and Gehrke, Johannes},
booktitle={Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining},
pages={150--158},
year={2012},
organization={ACM}
}
Paper link
"Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values" (Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana 2022)
@article{wang2022interpretability,
title={Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values},
author={Wang, Zijie J and Kale, Alex and Nori, Harsha and Stella, Peter and Nunnally, Mark E and Chau, Duen Horng and Vorvoreanu, Mihaela and Vaughan, Jennifer Wortman and Caruana, Rich},
journal={arXiv preprint arXiv:2206.15465},
year={2022}
}
Paper link
"Axiomatic Interpretability for Multiclass Additive Models" (X. Zhang, S. Tan, P. Koch, Y. Lou, U. Chajewska, and R. Caruana 2019)
@inproceedings{zhang2019axiomatic,
title={Axiomatic Interpretability for Multiclass Additive Models},
author={Zhang, Xuezhou and Tan, Sarah and Koch, Paul and Lou, Yin and Chajewska, Urszula and Caruana, Rich},
booktitle={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
pages={226--234},
year={2019},
organization={ACM}
}
Paper link
"Distill-and-compare: auditing black-box models using transparent model distillation" (S. Tan, R. Caruana, G. Hooker, and Y. Lou 2018)
@inproceedings{tan2018distill,
title={Distill-and-compare: auditing black-box models using transparent model distillation},
author={Tan, Sarah and Caruana, Rich and Hooker, Giles and Lou, Yin},
booktitle={Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society},
pages={303--310},
year={2018},
organization={ACM}
}
Paper link
"Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models" (B. Lengerich, S. Tan, C. Chang, G. Hooker, R. Caruana 2019)
@article{lengerich2019purifying,
title={Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models},
author={Lengerich, Benjamin and Tan, Sarah and Chang, Chun-Hao and Hooker, Giles and Caruana, Rich},
journal={arXiv preprint arXiv:1911.04974},
year={2019}
}
Paper link
"Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning" (H. Kaur, H. Nori, S. Jenkins, R. Caruana, H. Wallach, J. Wortman Vaughan 2020)
@inproceedings{kaur2020interpreting,
title={Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning},
author={Kaur, Harmanpreet and Nori, Harsha and Jenkins, Samuel and Caruana, Rich and Wallach, Hanna and Wortman Vaughan, Jennifer},
booktitle={Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems},
pages={1--14},
year={2020}
}
Paper link
"How Interpretable and Trustworthy are GAMs?" (C. Chang, S. Tan, B. Lengerich, A. Goldenberg, R. Caruana 2020)
@article{chang2020interpretable,
title={How Interpretable and Trustworthy are GAMs?},
author={Chang, Chun-Hao and Tan, Sarah and Lengerich, Ben and Goldenberg, Anna and Caruana, Rich},
journal={arXiv preprint arXiv:2006.06466},
year={2020}
}
Paper link
Differential Privacy
"Accuracy, Interpretability, and Differential Privacy via Explainable Boosting" (H. Nori, R. Caruana, Z. Bu, J. Shen, J. Kulkarni 2021)
@inproceedings{pmlr-v139-nori21a,
title = {Accuracy, Interpretability, and Differential Privacy via Explainable Boosting},
author = {Nori, Harsha and Caruana, Rich and Bu, Zhiqi and Shen, Judy Hanwen and Kulkarni, Janardhan},
booktitle = {Proceedings of the 38th International Conference on Machine Learning},
pages = {8227--8237},
year = {2021},
volume = {139},
series = {Proceedings of Machine Learning Research},
publisher = {PMLR}
}
Paper link
LIME
"Why should I trust you?: Explaining the predictions of any classifier" (M. T. Ribeiro, S. Singh, and C. Guestrin 2016)
@inproceedings{ribeiro2016should,
title={Why should I trust you?: Explaining the predictions of any classifier},
author={Ribeiro, Marco Tulio and Singh, Sameer and Guestrin, Carlos},
booktitle={Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining},
pages={1135--1144},
year={2016},
organization={ACM}
}
Paper link
SHAP
"A Unified Approach to Interpreting Model Predictions" (S. M. Lundberg and S.-I. Lee 2017)
@incollection{NIPS2017_7062,
title = {A Unified Approach to Interpreting Model Predictions},
author = {Lundberg, Scott M and Lee, Su-In},
booktitle = {Advances in Neural Information Processing Systems 30},
editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
pages = {4765--4774},
year = {2017},
publisher = {Curran Associates, Inc.},
url = {https://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf}
}
Paper link
"Consistent individualized feature attribution for tree ensembles" (Lundberg, Scott M and Erion, Gabriel G and Lee, Su-In 2018)
@article{lundberg2018consistent,
title={Consistent individualized feature attribution for tree ensembles},
author={Lundberg, Scott M and Erion, Gabriel G and Lee, Su-In},
journal={arXiv preprint arXiv:1802.03888},
year={2018}
}
Paper link
"Explainable machine-learning predictions for the prevention of hypoxaemia during surgery" (S. M. Lundberg et al. 2018)
@article{lundberg2018explainable,
title={Explainable machine-learning predictions for the prevention of hypoxaemia during surgery},
author={Lundberg, Scott M and Nair, Bala and Vavilala, Monica S and Horibe, Mayumi and Eisses, Michael J and Adams, Trevor and Liston, David E and Low, Daniel King-Wai and Newman, Shu-Fang and Kim, Jerry and others},
journal={Nature Biomedical Engineering},
volume={2},
number={10},
pages={749},
year={2018},
publisher={Nature Publishing Group}
}
Paper link
Sensitivity Analysis
"SALib: An open-source Python library for Sensitivity Analysis" (J. D. Herman and W. Usher 2017)
@article{herman2017salib,
title={SALib: An open-source Python library for Sensitivity Analysis.},
author={Herman, Jonathan D and Usher, Will},
journal={J. Open Source Software},
volume={2},
number={9},
pages={97},
year={2017}
}
Paper link
"Factorial sampling plans for preliminary computational experiments" (M. D. Morris 1991)
@article{morris1991factorial,
title={Factorial sampling plans for preliminary computational experiments},
author={Morris, Max D},
journal={Technometrics},
volume={33},
number={2},
pages={161--174},
year={1991},
publisher={Taylor \& Francis Group}
}
Paper link
Partial Dependence
"Greedy function approximation: a gradient boosting machine" (J. H. Friedman 2001)
@article{friedman2001greedy,
title={Greedy function approximation: a gradient boosting machine},
author={Friedman, Jerome H},
journal={Annals of statistics},
pages={1189--1232},
year={2001},
publisher={JSTOR}
}
Paper link
Open Source Software
"Scikit-learn: Machine learning in Python" (F. Pedregosa et al. 2011)
@article{pedregosa2011scikit,
title={Scikit-learn: Machine learning in Python},
author={Pedregosa, Fabian and Varoquaux, Ga{\"e}l and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and others},
journal={Journal of machine learning research},
volume={12},
number={Oct},
pages={2825--2830},
year={2011}
}
Paper link
"Collaborative data science" (Plotly Technologies Inc. 2015)
@online{plotly,
author = {Plotly Technologies Inc.},
title = {Collaborative data science},
publisher = {Plotly Technologies Inc.},
address = {Montreal, QC},
year = {2015},
url = {https://plot.ly}
}
Link
"Joblib: running python function as pipeline jobs" (G. Varoquaux and O. Grisel 2009)
@article{varoquaux2009joblib,
title={Joblib: running python function as pipeline jobs},
author={Varoquaux, Ga{\"e}l and Grisel, O},
journal={packages. python. org/joblib},
year={2009}
}
Link
- The Science Behind InterpretML: Explainable Boosting Machine
- How to Explain Models with InterpretML Deep Dive
- Black-Box and Glass-Box Explanation in Machine Learning
- Explainable AI explained! By-design interpretable models with Microsofts InterpretML
- Interpreting Machine Learning Models with InterpretML
- Machine Learning Model Interpretability using AzureML & InterpretML (Explainable Boosting Machine)
- A Case Study of Using Explainable Boosting Machines
- From SHAP to EBM: Explain your Gradient Boosting Models in Python
- Rich Caruana – Friends Don’t Let Friends Deploy Black-Box Models
- Machine Learning Interpretability in Banking: Why It Matters and How Explainable Boosting Machines Can Help
- Interpretable Machine Learning – Increase Trust and Eliminate Bias
- Enhancing Trust in Credit Risk Models: A Comparative Analysis of EBMs and GBMs
- Explainable AI: unlocking value in FEC operations
- Interpretable or Accurate? Why Not Both?
- The Explainable Boosting Machine. As accurate as gradient boosting, as interpretable as linear regression.
- Exploring explainable boosting machines
- Performance And Explainability With EBM
- InterpretML: Another Way to Explain Your Model
- A gentle introduction to GA2Ms, a white box model
- Explaining Non-Parametric Additive Models
- Model Interpretation with Microsoft’s Interpret ML
- Explaining Model Pipelines With InterpretML
- Explain Your Model with Microsoft’s InterpretML
- On Model Explainability: From LIME, SHAP, to Explainable Boosting
- Dealing with Imbalanced Data (Mortgage loans defaults)
- The right way to compute your Shapley Values
- The Art of Sprezzatura for Machine Learning
- Mixing Art into the Science of Model Explainability
- Automatic Piecewise Linear Regression
- MCTS EDA which makes sense
- Explainable Boosting machines for Tabular data
- Challenging the Performance-Interpretability Trade-off: An Evaluation of Interpretable Machine Learning Models
- GAMFORMER: In-context Learning for Generalized Additive Models
- Glass Box Machine Learning and Corporate Bond Returns
- Data Science with LLMs and Interpretable Models
- DimVis: Interpreting Visual Clusters in Dimensionality Reduction With Explainable Boosting Machine
- Distill knowledge of additive tree models into generalized linear models
- Explainable Boosting Machines with Sparsity - Maintaining Explainability in High-Dimensional Settings
- Cost of Explainability in AI: An Example with Credit Scoring Models
- Interpretable Machine Learning Leverages Proteomics to Improve Cardiovascular Disease Risk Prediction and Biomarker Identification
- Interpretable Additive Tabular Transformer Networks
- Signature Informed Sampling for Transcriptomic Data
- Interpretable Survival Analysis for Heart Failure Risk Prediction
- LLMs Understand Glass-Box Models, Discover Surprises, and Suggest Repairs
- Model Interpretability in Credit Insurance
- Federated Boosted Decision Trees with Differential Privacy
- Differentially private and explainable boosting machine with enhanced utility
- Balancing Explainability and Privacy in Bank Failure Prediction: A Differentially Private Glass-Box Approach
- GAM(E) CHANGER OR NOT? AN EVALUATION OF INTERPRETABLE MACHINE LEARNING MODELS
- GAM Coach: Towards Interactive and User-centered Algorithmic Recourse
- Missing Values and Imputation in Healthcare Data: Can Interpretable Machine Learning Help?
- Practice and Challenges in Building a Universal Search Quality Metric
- Explaining Phishing Attacks: An XAI Approach to Enhance User Awareness and Trust
- Revealing the Galaxy-Halo Connection Through Machine Learning
- How the Galaxy–Halo Connection Depends on Large-Scale Environment
- Explainable Artificial Intelligence for COVID-19 Diagnosis Through Blood Test Variables
- A diagnostic support system based on interpretable machine learning and oscillometry for accurate diagnosis of respiratory dysfunction in silicosis
- Using Explainable Boosting Machines (EBMs) to Detect Common Flaws in Data
- Differentially Private Gradient Boosting on Linear Learners for Tabular Data Analysis
- Concrete compressive strength prediction using an explainable boosting machine model
- Proxy endpoints - bridging clinical trials and real world data
- Machine Learning Model Reveals Determinators for Admission to Acute Mental Health Wards From Emergency Department Presentations
- Towards Cleaner Cities: Estimating Vehicle-Induced PM2.5 with Hybrid EBM-CMA-ES Modeling
- Predicting Robotic Hysterectomy Incision Time: Optimizing Surgical Scheduling with Machine Learning
- Using machine learning to assist decision making in the assessment of mental health patients presenting to emergency departments
- Proposing an inherently interpretable machine learning model for shear strength prediction of reinforced concrete beams with stirrups
- A hybrid machine learning approach for predicting fiber-reinforced polymer-concrete interface bond strength
- Using explainable machine learning and fitbit data to investigate predictors of adolescent obesity
- Interpretable Predictive Value of Including HDL-2b and HDL-3 in an Explainable Boosting Machine Model for Multiclass Classification of Coronary Artery Stenosis Severity in Acute Myocardial Infarction Patients
- Estimate Deformation Capacity of Non-Ductile RC Shear Walls Using Explainable Boosting Machine
- Introducing the Rank-Biased Overlap as Similarity Measure for Feature Importance in Explainable Machine Learning: A Case Study on Parkinson’s Disease
- Targeting resources efficiently and justifiably by combining causal machine learning and theory
- Extractive Text Summarization Using Generalized Additive Models with Interactions for Sentence Selection
- Death by Round Numbers: Glass-Box Machine Learning Uncovers Biases in Medical Practice
- Post-Hoc Interpretation of Transformer Hyperparameters with Explainable Boosting Machines
- Interpretable machine learning for predicting pathologic complete response in patients treated with chemoradiation therapy for rectal adenocarcinoma
- Exploring the Balance between Interpretability and Performance with carefully designed Constrainable Neural Additive Models
- Estimating Discontinuous Time-Varying Risk Factors and Treatment Benefits for COVID-19 with Interpretable ML
- StratoMod: Predicting sequencing and variant calling errors with interpretable machine learning
- Interpretable machine learning algorithms to predict leaf senescence date of deciduous trees
- Comparing Explainable Machine Learning Approaches With Traditional Statistical Methods for Evaluating Stroke Risk Models: Retrospective Cohort Study
- An Explainable AI Approach using Graph Learning to Predict ICU Length of Stay
- Cross Feature Selection to Eliminate Spurious Interactions and Single Feature Dominance Explainable Boosting Machines
- Multi-Objective Optimization of Performance and Interpretability of Tabular Supervised Machine Learning Models
- An explainable model to support the decision about the therapy protocol for AML
- Assessing wind field characteristics along the airport runway glide slope: an explainable boosting machine-assisted wind tunnel study
- Trustworthy Academic Risk Prediction with Explainable Boosting Machines
- Binary ECG Classification Using Explainable Boosting Machines for IoT Edge Devices
- Explainable artificial intelligence toward usable and trustworthy computer-aided diagnosis of multiple sclerosis from Optical Coherence Tomography
- An Interpretable Machine Learning Model with Deep Learning-based Imaging Biomarkers for Diagnosis of Alzheimer’s Disease
- Prediction of Alzheimer Disease on the DARWIN Dataset with Dimensionality Reduction and Explainability Techniques
- Explainable Boosting Machine for Predicting Alzheimer’s Disease from MRI Hippocampal Subfields
- Explainable Artificial Intelligence for Cotton Yield Prediction With Multisource Data
- Preoperative detection of extraprostatic tumor extension in patients with primary prostate cancer utilizing
- Monotone Tree-Based GAMI Models by Adapting XGBoost
- Neural Graphical Models
- FAST: An Optimization Framework for Fast Additive Segmentation in Transparent ML
- The Quantitative Analysis of Explainable AI for Network Anomaly Detection
- Enhancing Predictive Battery Maintenance Through the Use of Explainable Boosting Machine
- Improved Differentially Private Regression via Gradient Boosting
- Explainable Artificial Intelligence in Job Recommendation Systems
- Diagnosis uncertain models for medical risk prediction
- Extending Explainable Boosting Machines to Scientific Image Data
- Pest Presence Prediction Using Interpretable Machine Learning
- Key Thresholds and Relative Contributions of Knee Geometry, Anteroposterior Laxity, and Body Weight as Risk Factors for Noncontact ACL Injury
- A clinical prediction model for 10-year risk of self-reported osteoporosis diagnosis in pre- and perimenopausal women
- epitope1D: Accurate Taxonomy-Aware B-Cell Linear Epitope Prediction
- Explainable Boosting Machines for Slope Failure Spatial Predictive Modeling
- Micromodels for Efficient, Explainable, and Reusable Systems: A Case Study on Mental Health
- Identifying main and interaction effects of risk factors to predict intensive care admission in patients hospitalized with COVID-19
- Leveraging interpretable machine learning in intensive care
- Development of prediction models for one-year brain tumour survival using machine learning: a comparison of accuracy and interpretability
- Using Interpretable Machine Learning to Predict Maternal and Fetal Outcomes
- Calibrate: Interactive Analysis of Probabilistic Model Output
- Neural Additive Models: Interpretable Machine Learning with Neural Nets
- TabSRA: An Attention based Self-Explainable Model for Tabular Learning
- Evaluating the Efficacy of Instance Incremental vs. Batch Learning in Delayed Label Environments: An Empirical Study on Tabular Data Streaming for Fraud Detection
- Improving Neural Additive Models with Bayesian Principles
- NODE-GAM: Neural Generalized Additive Model for Interpretable Deep Learning
- Scalable Interpretability via Polynomials
- Polynomial Threshold Functions of Bounded Tree-Width: Some Explainability and Complexity Aspects
- Neural Basis Models for Interpretability
- ILMART: Interpretable Ranking with Constrained LambdaMART
- Integrating Co-Clustering and Interpretable Machine Learning for the Prediction of Intravenous Immunoglobulin Resistance in Kawasaki Disease
- Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression
- Application of boosted trees to the prognosis prediction of COVID‐19
- Explainable Gradient Boosting for Corporate Crisis Forecasting in Italian Businesses
- Revisiting differentially private XGBoost: Are random decision trees really better than greedy ones?
- Investigating Trust in Human-Machine Learning Collaboration: A Pilot Study on Estimating Public Anxiety from Speech
- pureGAM: Learning an Inherently Pure Additive Model
- GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions
- Interpretable Machine Learning based on Functional ANOVA Framework: Algorithms and Comparisons
- Using Model-Based Trees with Boosting to Fit Low-Order Functional ANOVA Models
- Interpretable generalized additive neural networks
- A Concept and Argumentation based Interpretable Model in High Risk Domains
- Analyzing the Differences between Professional and Amateur Esports through Win Probability
- Explainable machine learning with pairwise interactions for the classification of Parkinson’s disease and SWEDD from clinical and imaging features
- Interpretable Prediction of Goals in Soccer
- Extending the Tsetlin Machine with Integer-Weighted Clauses for Increased Interpretability
- In Pursuit of Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism Prediction
- From Shapley Values to Generalized Additive Models and back
- Developing A Visual-Interactive Interface for Electronic Health Record Labeling
- Development and Validation of an Interpretable 3-day Intensive Care Unit Readmission Prediction Model Using Explainable Boosting Machines
- Death by Round Numbers and Sharp Thresholds: How to Avoid Dangerous AI EHR Recommendations
- Building a predictive model to identify clinical indicators for COVID-19 using machine learning method
- Using Innovative Machine Learning Methods to Screen and Identify Predictors of Congenital Heart Diseases
- Impact of Accuracy on Model Interpretations
- Machine Learning Algorithms for Identifying Dependencies in OT Protocols
- Causal Understanding of Why Users Share Hate Speech on Social Media
- Explainable Boosting Machine: A Contemporary Glass-Box Model to Analyze Work Zone-Related Road Traffic Crashes
- Efficient and Interpretable Traffic Destination Prediction using Explainable Boosting Machines
- Explainable Artificial Intelligence Paves the Way in Precision Diagnostics and Biomarker Discovery for the Subclass of Diabetic Retinopathy in Type 2 Diabetics
- A proposed tree-based explainable artificial intelligence approach for the prediction of angina pectoris
- Explainable Boosting Machine: A Contemporary Glass-Box Strategy for the Assessment of Wind Shear Severity in the Runway Vicinity Based on the Doppler Light Detection and Ranging Data
- On the Physical Nature of Lya Transmission Spikes in High Redshift Quasar Spectra
- GRAND-SLAMIN’ Interpretable Additive Modeling with Structural Constraints
- Identification of groundwater potential zones in data-scarce mountainous region using explainable machine learning
- Explainable Classification Techniques for Quantum Dot Device Measurements
- Machine Learning for High-Risk Applications
- Interpretable Machine Learning with Python
- Explainable Artificial Intelligence: An Introduction to Interpretable Machine Learning
- Applied Machine Learning Explainability Techniques
- The eXplainable A.I.: With Python examples
- Platform and Model Design for Responsible AI: Design and build resilient, private, fair, and transparent machine learning models
- Explainable AI Recipes
- Ensemble Methods for Machine Learning
- EBM to Onnx converter by SoftAtHome
- EBM to SQL converter - ML 2 SQL
- EBM to PMML converter - SkLearn2PMML
- EBM visual editor - GAM Changer
- Interpreting Visual Clusters in Dimensionality Reduction - DimVis
There are multiple ways to get in touch:
- Email us at [email protected]
- Or, feel free to raise a GitHub issue
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for interpret
Similar Open Source Tools
interpret
InterpretML is an open-source package that incorporates state-of-the-art machine learning interpretability techniques under one roof. With this package, you can train interpretable glassbox models and explain blackbox systems. InterpretML helps you understand your model's global behavior, or understand the reasons behind individual predictions. Interpretability is essential for: - Model debugging - Why did my model make this mistake? - Feature Engineering - How can I improve my model? - Detecting fairness issues - Does my model discriminate? - Human-AI cooperation - How can I understand and trust the model's decisions? - Regulatory compliance - Does my model satisfy legal requirements? - High-risk applications - Healthcare, finance, judicial, ...
Slow_Thinking_with_LLMs
STILL is an open-source project exploring slow-thinking reasoning systems, focusing on o1-like reasoning systems. The project has released technical reports on enhancing LLM reasoning with reward-guided tree search algorithms and implementing slow-thinking reasoning systems using an imitate, explore, and self-improve framework. The project aims to replicate the capabilities of industry-level reasoning systems by fine-tuning reasoning models with long-form thought data and iteratively refining training datasets.
Medical_Image_Analysis
The Medical_Image_Analysis repository focuses on X-ray image-based medical report generation using large language models. It provides pre-trained models and benchmarks for CheXpert Plus dataset, context sample retrieval for X-ray report generation, and pre-training on high-definition X-ray images. The goal is to enhance diagnostic accuracy and reduce patient wait times by improving X-ray report generation through advanced AI techniques.
vector-search-class-notes
The 'vector-search-class-notes' repository contains class materials for a course on Long Term Memory in AI, focusing on vector search and databases. The course covers theoretical foundations and practical implementation of vector search applications, algorithms, and systems. It explores the intersection of Artificial Intelligence and Database Management Systems, with topics including text embeddings, image embeddings, low dimensional vector search, dimensionality reduction, approximate nearest neighbor search, clustering, quantization, and graph-based indexes. The repository also includes information on the course syllabus, project details, selected literature, and contributions from industry experts in the field.
llm-course
The LLM course is divided into three parts:

1. 🧩 **LLM Fundamentals** covers essential knowledge about mathematics, Python, and neural networks.
2. 🧑‍🔬 **The LLM Scientist** focuses on building the best possible LLMs using the latest techniques.
3. 👷 **The LLM Engineer** focuses on creating LLM-based applications and deploying them.

For an interactive version of this course, I created two **LLM assistants** that will answer questions and test your knowledge in a personalized way:

* 🤗 **HuggingChat Assistant**: Free version using Mixtral-8x7B.
* 🤖 **ChatGPT Assistant**: Requires a premium account.

## 📝 Notebooks

A list of notebooks and articles related to large language models.

### Tools

| Notebook | Description |
|----------|-------------|
| 🧐 LLM AutoEval | Automatically evaluate your LLMs using RunPod. |
| 🥱 LazyMergekit | Easily merge models using MergeKit in one click. |
| 🦎 LazyAxolotl | Fine-tune models in the cloud using Axolotl in one click. |
| ⚡ AutoQuant | Quantize LLMs in GGUF, GPTQ, EXL2, AWQ, and HQQ formats in one click. |
| 🌳 Model Family Tree | Visualize the family tree of merged models. |
| 🚀 ZeroSpace | Automatically create a Gradio chat interface using a free ZeroGPU. |
oat
Oat is a simple and efficient framework for running online LLM alignment algorithms. It implements a distributed Actor-Learner-Oracle architecture, with components optimized using state-of-the-art tools. Oat simplifies the experimental pipeline of LLM alignment by serving an Oracle online for preference data labeling and model evaluation. It provides a variety of oracles for simulating feedback and supports verifiable rewards. Oat's modular structure allows for easy inheritance and modification of classes, enabling rapid prototyping and experimentation with new algorithms. The framework implements cutting-edge online algorithms like PPO for math reasoning and various online exploration algorithms.
HuixiangDou2
HuixiangDou2 is a robustly optimized GraphRAG approach that integrates multiple open-source projects to improve performance in graph-based augmented generation. It conducts comparative experiments and achieves a significant score increase, leading to a GraphRAG implementation with recognized performance. The repository provides code improvements, dense retrieval for querying entities and relationships, real domain knowledge testing, and impact analysis on accuracy.
Taiyi-LLM
Taiyi (太一) is a bilingual large language model fine-tuned for diverse biomedical tasks. It aims to facilitate communication between healthcare professionals and patients, provide medical information, and assist in diagnosis, biomedical knowledge discovery, drug development, and personalized healthcare solutions. The model is based on the Qwen-7B-base model and has been fine-tuned using rich bilingual instruction data. It covers tasks such as question answering, biomedical dialogue, medical report generation, biomedical information extraction, machine translation, title generation, text classification, and text semantic similarity. The project also provides standardized data formats, model training details, model inference guidelines, and overall performance metrics across various BioNLP tasks.
GOLEM
GOLEM is an open-source AI framework focused on optimization and learning of structured graph-based models using meta-heuristic methods. It emphasizes the potential of meta-heuristics in complex problem spaces where gradient-based methods are not suitable, and the importance of structured models in various problem domains. The framework offers features like structured model optimization, metaheuristic methods, multi-objective optimization, constrained optimization, extensibility, interpretability, and reproducibility. It can be applied to optimization problems represented as directed graphs with defined fitness functions. GOLEM has applications in areas like AutoML, Bayesian network structure search, differential equation discovery, geometric design, and neural architecture search. The project structure includes packages for core functionalities, adapters, graph representation, optimizers, genetic algorithms, utilities, serialization, visualization, examples, and testing. Contributions are welcome, and the project is supported by ITMO University's Research Center Strong Artificial Intelligence in Industry.
awesome-hallucination-detection
This repository provides a curated list of papers, datasets, and resources related to the detection and mitigation of hallucinations in large language models (LLMs). Hallucinations refer to the generation of factually incorrect or nonsensical text by LLMs, which can be a significant challenge for their use in real-world applications. The resources in this repository aim to help researchers and practitioners better understand and address this issue.
OREAL
OREAL is a reinforcement learning framework designed for mathematical reasoning tasks, aiming to achieve optimal performance through outcome reward-based learning. The framework utilizes behavior cloning, reshaping rewards, and token-level reward models to address challenges in sparse rewards and partial correctness. OREAL has achieved significant results, with a 7B model reaching 94.0 pass@1 accuracy on MATH-500 and surpassing previous 32B models. The tool provides training tutorials and Hugging Face model repositories for easy access and implementation.
MMMU
MMMU is a benchmark designed to evaluate multimodal models on college-level subject knowledge tasks, covering 30 subjects and 183 subfields with 11.5K questions. It focuses on advanced perception and reasoning with domain-specific knowledge, challenging models to perform tasks akin to those faced by experts. The evaluation of various models highlights substantial challenges, with room for improvement to stimulate the community towards expert artificial general intelligence (AGI).
agents
Agents 2.0 is a framework for training language agents using symbolic learning, inspired by connectionist learning for neural nets. It implements main components of connectionist learning like back-propagation and gradient-based weight update in the context of agent training using language-based loss, gradients, and weights. The framework supports optimizing multi-agent systems and allows multiple agents to take actions in one node.
pytorch-forecasting
PyTorch Forecasting is a PyTorch-based package for time series forecasting with state-of-the-art network architectures. It offers a high-level API for training networks on pandas data frames and is built on PyTorch Lightning for scalable training on GPUs and CPUs. The package aims to simplify time series forecasting with neural networks by providing a flexible API for professionals and default settings for beginners. It includes a time series dataset class, a base model class, multiple neural network architectures, multi-horizon time series metrics, and hyperparameter tuning with Optuna.
Awesome-LLM-in-Social-Science
This repository compiles a list of academic papers that evaluate, align, simulate, and provide surveys or perspectives on the use of Large Language Models (LLMs) in the field of Social Science. The papers cover various aspects of LLM research, including assessing their alignment with human values, evaluating their capabilities in tasks such as opinion formation and moral reasoning, and exploring their potential for simulating social interactions and addressing issues in diverse fields of Social Science. The repository aims to provide a comprehensive resource for researchers and practitioners interested in the intersection of LLMs and Social Science.
only_train_once
Only Train Once (OTO) is an automatic, architecture-agnostic DNN training and compression framework that allows users to train a general DNN from scratch or a pretrained checkpoint to achieve high performance and slimmer architecture simultaneously in a one-shot manner without fine-tuning. The framework includes features for automatic structured pruning and erasing operators, as well as hybrid structured sparse optimizers for efficient model compression. OTO provides tools for pruning zero-invariant group partitioning, constructing pruned models, and visualizing pruning and erasing dependency graphs. It supports the HESSO optimizer and offers a sanity check for compliance testing on various DNNs. The repository also includes publications, installation instructions, quick start guides, and a roadmap for future enhancements and collaborations.
For similar tasks
llm_aigc
The llm_aigc repository is a comprehensive resource for everything related to llm (Large Language Models) and aigc (AI Governance and Control). It provides detailed information, resources, and tools for individuals interested in understanding and working with large language models and AI governance and control. The repository covers a wide range of topics including model training, evaluation, deployment, ethics, and regulations in the AI field.
docker-aio
The docker-aio repository provides an accelerated mirror service for Docker users, allowing them to speed up image pulls by replacing original domains with corresponding accelerated domains. Users in Asia are advised to comply with local laws and regulations when using this service. The repository offers installation scripts and instructions on how to modify Docker configurations to utilize the accelerated mirrors effectively.
xaitk-saliency
The `xaitk-saliency` package is an open source Explainable AI (XAI) framework for visual saliency algorithm interfaces and implementations, designed for analytics and autonomy applications. It provides saliency algorithms for various image understanding tasks such as image classification, image similarity, object detection, and reinforcement learning. The toolkit targets data scientists and developers who aim to incorporate visual saliency explanations into their workflow or product, offering both direct accessibility for experimentation and modular integration into systems and applications through Strategy and Adapter patterns. The package includes documentation, examples, and a demonstration tool for visual saliency generation in a user-interface.
For similar jobs
lollms-webui
LoLLMs WebUI (Lord of Large Language Multimodal Systems: One tool to rule them all) is a user-friendly interface for accessing and using various LLMs (Large Language Models) and other AI models for a wide range of tasks. With over 500 AI expert conditionings across diverse domains and more than 2500 fine-tuned models across multiple domains, LoLLMs WebUI provides an immediate resource for any problem, from car repair to coding assistance, legal matters, medical diagnosis, entertainment, and more. Its easy-to-use UI offers light and dark modes, GitHub repository integration, and support for different personalities. It also supports thumbs up/down rating; copying, editing, and removing messages; local database storage; and searching, exporting, and deleting multiple discussions.
Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (containing a demo web application, Power BI reports, Synapse resources, AML notebooks, etc.) that can be deployed in a customer’s subscription using the CAPE tool within a few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.
minio
MinIO is a high-performance object storage system released under the GNU Affero General Public License v3.0. It is API-compatible with the Amazon S3 cloud storage service. Use MinIO to build high-performance infrastructure for machine learning, analytics, and application data workloads.
mage-ai
Mage is an open-source data pipeline tool for transforming and integrating data. It offers an easy developer experience, engineering best practices built-in, and data as a first-class citizen. Mage makes it easy to build, preview, and launch data pipelines, and provides observability and scaling capabilities. It supports data integrations, streaming pipelines, and dbt integration.
AiTreasureBox
AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.
tidb
TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.
airbyte
Airbyte is an open-source data integration platform that makes it easy to move data from any source to any destination. With Airbyte, you can build and manage data pipelines without writing any code. Airbyte provides a library of pre-built connectors that make it easy to connect to popular data sources and destinations. You can also create your own connectors using Airbyte's no-code Connector Builder or low-code CDK. Airbyte is used by data engineers and analysts at companies of all sizes to build and manage their data pipelines.
labelbox-python
Labelbox is a data-centric AI platform for enterprises to develop, optimize, and use AI to solve problems and power new products and services. Enterprises use Labelbox to curate data, generate high-quality human feedback data for computer vision and LLMs, evaluate model performance, and automate tasks by combining AI and human-centric workflows. The academic & research community uses Labelbox for cutting-edge AI research.



