We introduce and discuss a knowledge-driven distillation approach to explaining black-box models by means of two kinds of interpretable models. The first is perceptron (or threshold) connectives, which enrich knowledge representation languages such as Description Logics with linear operators that serve as a bridge between statistical learning and logical reasoning. The second is Trepan Reloaded, an approach that builds post-hoc explanations of black-box classifiers in the form of decision trees enhanced by domain knowledge. Our aim is, firstly, to target a model-agnostic distillation approach exemplified with these two frameworks, secondly, to study howthese two frameworks interact on a theoretical level, and, thirdly, to investigate use-cases in ML and AI in a comparative manner. Specifically, we envision that user-studies will help determine human understandability of explanations generated using these two frameworks.

Towards Knowledge-driven Distillation and Explanation of Black-box Models

Daniele Porello;
2021

Abstract

We introduce and discuss a knowledge-driven distillation approach to explaining black-box models by means of two kinds of interpretable models. The first is perceptron (or threshold) connectives, which enrich knowledge representation languages such as Description Logics with linear operators that serve as a bridge between statistical learning and logical reasoning. The second is Trepan Reloaded, an approach that builds post-hoc explanations of black-box classifiers in the form of decision trees enhanced by domain knowledge. Our aim is, firstly, to target a model-agnostic distillation approach exemplified with these two frameworks, secondly, to study howthese two frameworks interact on a theoretical level, and, thirdly, to investigate use-cases in ML and AI in a comparative manner. Specifically, we envision that user-studies will help determine human understandability of explanations generated using these two frameworks.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11567/1082518
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact