Marc Plantevit

On GNN explainability with activation rules

Abstract

GNNs are powerful models based on node representation learning that perform particularly well in many machine learning problems related to graphs. The major obstacle to the deployment of GNNs is mostly a problem of societal acceptability and trustworthiness, properties which require making explicit the internal functioning of such models. Here, we propose to mine activation rules in the hidden layers to understand how GNNs perceive the world. The problem is not to discover activation rules that are individually highly discriminating for an output of the model. Instead, the challenge is to provide a small set of rules that cover all input graphs. To this end, we introduce the subjective activation pattern domain. We define an effective and principled algorithm to enumerate activation rules in each hidden layer. The proposed approach for quantifying the interest of these rules is rooted in information theory and is able to account for background knowledge on the input graph data. The activation rules can then be redescribed using pattern languages involving interpretable features. We show that the activation rules provide insights into the characteristics used by the GNN to classify the graphs. In particular, this makes it possible to identify the hidden features built by the GNN through its different layers. These rules can also subsequently be used for explaining GNN decisions. Experiments on both synthetic and real-life datasets show highly competitive performance, with up to 200% improvement in fidelity on explaining graph classification over the SOTA methods.

Discovering and visualizing tactics in table tennis games based on subgroup discovery

By Pierre Duluard, Xinqing Li, Marc Plantevit, Céline Robardet, Romain Vuillemot

2022-09-19

In Machine learning and data mining for sports analytics - 9th international workshop, MLSA 2022

Abstract

We report on preliminary results to automatically identify efficient tactics of elite players in table tennis games. We define such tactics as subgroups of winning strokes, which table tennis experts seek to obtain in order to train players and adapt their strategy during games. We first report on the creation of such subgroups and their ranking by the weighted relative accuracy measure (WRAcc). We then report on the representation of the subgroups using visualizations that enabled our expert to provide rapid feedback and hence provided us with guidance towards further improvements of our discoveries.
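As an illustration of the ranking measure mentioned above, here is a minimal sketch of the standard WRAcc definition: the coverage of a subgroup multiplied by the difference between the positive rate inside the subgroup and the positive rate overall. The function name `wracc` and the toy stroke data are assumptions made for this example and are not taken from the paper.

```python
def wracc(in_subgroup, is_positive):
    """Weighted relative accuracy of a subgroup for a binary target.

    in_subgroup: list of bools, True if the record is covered by the subgroup.
    is_positive: list of bools, True if the record has the target value
                 (here, hypothetically, a winning stroke).
    """
    n = len(in_subgroup)
    if n == 0 or n != len(is_positive):
        raise ValueError("inputs must be non-empty and of equal length")

    n_sub = sum(in_subgroup)
    if n_sub == 0:
        return 0.0  # an empty subgroup carries no information

    pos_total = sum(is_positive)
    pos_sub = sum(1 for s, p in zip(in_subgroup, is_positive) if s and p)

    coverage = n_sub / n                    # relative size of the subgroup
    gain = pos_sub / n_sub - pos_total / n  # lift of the positive rate
    return coverage * gain


# Toy data (hypothetical): 8 strokes, the subgroup covers the first 4,
# 3 of which are winning; 4 strokes out of 8 are winning overall.
covered = [True, True, True, True, False, False, False, False]
winning = [True, True, True, False, True, False, False, False]
score = wracc(covered, winning)  # 0.5 * (0.75 - 0.5) = 0.125
```

A subgroup that is both large and markedly more successful than average scores highest, which is why the measure suits ranking candidate tactics.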

Improving the quality of rule-based GNN explanations

By Ataollah Kamal, Elouan Vincent, Marc Plantevit, Céline Robardet

2022-09-12

In Workshop on eXplainable knowledge discovery in data mining. Machine learning and principles and practice of knowledge discovery in databases - international workshops of ECML PKDD 2022, Grenoble, France, September 19-23, 2022, proceedings, part I

Abstract

Recent works have proposed to explain GNNs using activation rules. Activation rules capture specific configurations in the embedding space of a given layer that are discriminant for the GNN decision. These rules also capture hidden features of input graphs, which requires associating them with representative graphs. In this paper, we propose on the one hand an analysis of heuristic-based algorithms to extract the activation rules, and on the other hand the use of transport-based optimal graph distances to associate each rule with the most specific graph that triggers it.

Using subgroup discovery to relate odor pleasantness and intensity to peripheral nervous system reactions

By Maelle Moranges, Marc Plantevit, Moustafa Bensafi

2022-07-24

In IEEE Transactions on Affective Computing

Abstract

Activation of the autonomic nervous system is a primary characteristic of human hedonic responses to sensory stimuli. For smells, general tendencies of physiological reactions have been described using classical statistics. However, these physiological variations are generally not quantified precisely; each psychophysiological parameter has very often been studied separately and individual variability was not systematically considered. The current study presents an innovative approach based on data mining, whose goal is to extract knowledge from a dataset. This approach uses a subgroup discovery algorithm which allows extraction of rules that apply to as many olfactory stimuli and individuals as possible. These rules are described by intervals on a set of physiological attributes. The results made it possible both to quantify how each physiological parameter relates to odor pleasantness and perceived intensity and to describe the participation of each individual in these rules. This approach can be applied to other fields of affective sciences characterized by complex and heterogeneous datasets.

What does my GNN really capture? On exploring internal GNN representations

By Luca Veyrin-Forrer, Ataollah Kamal, Stefan Duffner, Marc Plantevit, Céline Robardet

2022-07-23

In Proceedings of the 31st international joint conference on artificial intelligence (IJCAI’22)

Abstract

GNNs are efficient for classifying graphs, but their internal workings are opaque, which limits their field of application. Existing methods for explaining GNNs focus on disclosing the relationships between input graphs and the model’s decision. In contrast, the method we propose isolates internal features, hidden in the network layers, which are automatically identified by the GNN to classify graphs. We show that this method makes it possible to know the parts of the input graphs used by the GNN with much less bias than the SOTA methods and therefore to provide confidence in the decision process.

Qu’est-ce que mon GNN capture vraiment ? Exploration des représentations internes d’un GNN

By Luca Veyrin-Forrer, Ataollah Kamal, Stefan Duffner, Marc Plantevit, Céline Robardet

2022-03-24

In Extraction et gestion des connaissances, EGC 2022, Blois, France, 24 au 28 janvier 2022

Abstract

While existing GNN explanation methods explain the decision by studying the output layer, we propose a method that analyzes the hidden layers to identify the neurons that are co-activated for a class. We then associate a graph with each such set of neurons.
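To make the notion of co-activated neurons concrete, here is a minimal sketch under stated assumptions: a neuron counts as "active" when its output is strictly positive, and a rule is a set of neuron indices that must all be active at once. The function `rule_support` and the toy activations are hypothetical and are not the authors' implementation.

```python
def rule_support(activations, labels, rule, target):
    """Count how often a set of neurons is co-activated inside and outside a class.

    activations: list of activation vectors (lists of floats), one per input.
    labels:      list of class labels, aligned with `activations`.
    rule:        set of neuron indices that must all be active (assumed: > 0).
    target:      the class the rule is meant to characterise.
    Returns (covered inputs in the target class, covered inputs outside it).
    """
    covered = [all(vec[i] > 0 for i in rule) for vec in activations]
    in_class = sum(1 for c, y in zip(covered, labels) if c and y == target)
    out_class = sum(1 for c, y in zip(covered, labels) if c and y != target)
    return in_class, out_class


# Toy hidden-layer outputs (hypothetical): 4 inputs, 3 neurons each.
acts = [
    [0.5, 1.2, 0.0],
    [0.3, 0.9, 0.4],
    [0.0, 0.7, 0.2],
    [0.8, 0.0, 0.1],
]
classes = [1, 1, 0, 0]
support = rule_support(acts, classes, rule={0, 1}, target=1)  # (2, 0)
```

In this toy case neurons 0 and 1 fire together only on inputs of class 1, so the rule is a candidate descriptor of that class.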

Electricity price forecasting on the day-ahead market using machine learning

Abstract

The price of electricity on the European market is very volatile. This is due both to its mode of production by different sources, each with its own constraints (volume of production, dependence on the weather, or production inertia), and to the difficulty of its storage. Being able to predict the prices of the next day is an important issue for the development of intelligent uses of electricity. In this article, we investigate the capabilities of different machine learning techniques to accurately predict electricity prices. Specifically, we extend current state-of-the-art approaches by considering previously unused predictive features such as price histories of neighboring countries. We show that these features significantly improve the quality of forecasts, even in the current period when sudden changes are occurring. We also develop an analysis of the contribution of the different features to model prediction using SHAP values, in order to shed light on how models make their predictions and to build user confidence in models.

Découverte de sous-groupes de prédictions interprétables pour le triage d’incidents

By Youcef Remil, Anes Bendimerad, Marc Plantevit, Céline Robardet, Mehdi Kaytoue

2022-01-24

In Extraction et gestion des connaissances, EGC 2022, Blois, France, 24 au 28 janvier 2022

Abstract

The need for predictive maintenance comes with an increasing number of incidents, where it is imperative to quickly decide which service to contact for corrective actions. Several predictive models have been designed to automate this process, but the efficient models are opaque (i.e., black boxes). Many approaches have been proposed to locally explain each prediction of such models. However, providing an explanation for every result is not feasible when there is a large number of daily predictions to analyze. In this article, we propose a method based on Subgroup Discovery in order to (1) group together objects that share similar explanations and (2) provide a description that characterises each subgroup.
