Publications

Explorer les débats parlementaires français de la troisième république par leurs sujets

By Marie Puren, Aurélien Pellet

2023-06-01

In Humanistica 2023

Abstract

This article compares three methods for exploring large corpora of historical documents by topic. We work here on the French parliamentary debates of the Third Republic, which lend themselves particularly well to this type of analysis. After presenting the context of the study, we report the results obtained with three natural language processing methods applied to texts published between 1876 and 1914: latent Dirichlet allocation, word embeddings, and transfer learning.

L’identification des projets de logiciel libre accessibles aux nouveaux contributeurs

By Paul Hervot, Benoît Crespin

2023-06-01

In EIAH2023 : 11ème conférence sur les environnements informatiques pour l’apprentissage humain

Abstract

FOSS makes up an increasing share of the public and industrial software landscape, notably for its transparency and democratic governance. However, simply publishing the source code of a piece of software does not automatically make it accessible, and many barriers impede new contributors approaching these projects. Through large-scale mining of the Software Heritage archive, we test the relevance of three signals for identifying FOSS projects that are accessible to new contributors. Our results show a positive correlation between the number of new contributors who successfully bring their contribution to completion and the presence of contributing guidelines, as well as between that same number and the number of recent unique contributors to the project. Such signals could find a use in the teaching of FOSS practices, helping teachers to select accessible projects for their students.
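
As a rough illustration of the kind of correlation check described above, the sketch below computes a Pearson coefficient between a binary "has contributing guidelines" signal and a count of newcomers who completed a contribution. The data, the variable names, and the choice of Pearson correlation are assumptions made for illustration; the abstract does not state which statistical method the study uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy, made-up data: one entry per hypothetical project (not from the study).
has_guidelines = [1, 1, 0, 0, 1, 0]
successful_newcomers = [12, 9, 3, 4, 15, 2]

r = pearson(has_guidelines, successful_newcomers)
print(f"r = {r:.2f}")
```

A positive `r` on real data would be consistent with the reported result, although a correlation alone says nothing about causation.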

Les jeux d’argent dans la population carcérale : Pratiques du jeu, trajectoires de joueurs, problématiques d’addiction

Abstract

The exploratory research we conducted between September 2021 and March 2023 on gambling practices and gambling addiction in the French prison population corroborates the results of international studies already carried out on the subject: a high prevalence of gambling practices and gambling addiction problems among people detained in the facilities studied; a possible causal relationship between the gambling problems experienced by the people concerned and the reason for their incarceration; the persistence and diversity of gambling practices inside prison despite the ban on them and the more or less strict monitoring to which they are subject; a recreational dimension to these practices despite risky behavior, particularly among gamblers who describe themselves as problem gamblers; and limited access to care for gamblers who would like specific help but who can sometimes prove reluctant to commit to therapy. These conclusions were, admittedly, more or less expected, but they are no less alarming, from both a social and a health perspective. Indeed, while some gamblers who have had gambling problems seem to take advantage of their detention to stop gambling, or even to seek treatment when they have the opportunity, others find strategies to keep gambling among inmates or on the internet, either recreationally or to meet their needs, with the risks that this entails. Above all, we observe the particular trajectory of a non-negligible share of these gamblers, whose gambling addiction reportedly contributed to leading them to prison, sometimes repeatedly.
While the criminogenic nature of this addiction remains to be demonstrated, if only because establishing a causal link between the gambling problems of the gamblers concerned and their legal problems depends on how they perceive them, it can nevertheless be said that there is indeed a risk, for some so-called problem gamblers, of entering a “deviant career” that may lead to incarceration, bearing in mind that this risk depends in particular on their living conditions, their social environment, their gambling practice, and their trajectory as gamblers.

Linear object detection in document images using multiple object tracking

By Philippe Bernet, Joseph Chazalon, Edwin Carlinet, Alexandre Bourquelot, Élodie Puybareau

2023-06-01

In Proceedings of the international conference on document analysis and recognition (ICDAR 2023)

Abstract

Linear objects convey substantial information about document structure, but are challenging to detect accurately because of degradation (curved, erased) or decoration (doubled, dashed). Many approaches can recover some vector representation, but only one closed-source technique, introduced in 1994 and based on Kalman filters (a particular case of Multiple Object Tracking algorithms), can perform a pixel-accurate instance segmentation of linear objects and make it possible to selectively remove them from the original image. We aim to re-popularize this approach and propose: 1. a framework for accurate instance segmentation of linear objects in document images using Multiple Object Tracking (MOT); 2. document image datasets and metrics which enable both vector- and pixel-based evaluation of linear object detection; 3. performance measures of MOT approaches against modern segment detectors; 4. performance measures of various tracking strategies, exhibiting alternatives to the original Kalman filter approach; and 5. an open-source implementation of a detector which can discriminate instances of curved, erased, dashed, intersecting and/or overlapping linear objects.
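
To give a feel for the Kalman-filter idea mentioned above, here is a minimal sketch of a scalar filter that follows the vertical position of one nearly horizontal line as an image is scanned column by column. It is not the paper's framework: it tracks a single object, assumes a constant-position model, and the function name, parameters, and sample values are all hypothetical. A `None` observation stands for a column where the line is erased.

```python
def track_line(observations, process_var=0.05, meas_var=1.0):
    """Scalar Kalman filter over per-column line positions.

    observations: measured positions, or None where the line is missing.
    Returns the filtered position for each column (None before first hit).
    """
    estimate = None
    variance = None
    track = []
    for z in observations:
        if estimate is None:
            # Not initialised yet: start the track at the first measurement.
            if z is not None:
                estimate, variance = float(z), meas_var
            track.append(estimate)
            continue
        # Predict: position assumed unchanged, uncertainty grows.
        variance += process_var
        if z is not None:
            # Update: blend prediction and measurement using the Kalman gain.
            gain = variance / (variance + meas_var)
            estimate += gain * (z - estimate)
            variance *= 1.0 - gain
        track.append(estimate)
    return track


# Hypothetical measurements with a two-column gap (erased line segment).
observed = [10.0, 10.4, 9.8, None, None, 10.1, 10.3]
track = track_line(observed)
print([round(v, 2) for v in track])
```

During the gap the filter simply carries its prediction forward, which is what lets a tracker bridge erased or dashed segments; a multi-object tracker would additionally have to associate each observation with one of several such tracks.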

Metrics for community dynamics applied to unsupervised attacks detection

By Julien Michel, Pierre Parrend

2023-06-01

In Rencontres des jeunes chercheurs en intelligence artificielle

Abstract

Attack detection in large networks has become a necessity. Yet, with the ever-changing threat landscape and the massive amount of data to handle, network intrusion detection systems (NIDS) end up being obsolete. Different machine-learning-based solutions have been developed to answer the detection problem for data with evolving statistical distributions. However, no approach has proved to be both scalable and robust to the passage of time. In this paper, we propose a scalable and unsupervised approach to detect behavioral patterns without prior knowledge of the nature of attacks. For this purpose, we define novel metrics for graph community dynamics and use them as features with an unsupervised detection algorithm on the UGR’16 dataset. The proposed approach improves existing detection algorithms by 285.56% in precision and 222.82% in recall when compared to usual feature extraction (FE) using isolation forest.
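
Improvements above 100% are easiest to read as relative gains over a baseline. The sketch below shows that arithmetic, assuming the reported figures are relative improvements; the baseline value used is hypothetical and is not taken from the paper.

```python
def apply_relative_improvement(baseline, improvement_pct):
    """Score obtained after improving `baseline` by `improvement_pct` percent."""
    return baseline * (1.0 + improvement_pct / 100.0)


def relative_improvement_pct(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    return (improved - baseline) / baseline * 100.0


# Hypothetical baseline precision (not a figure from the paper).
baseline_precision = 0.10
improved_precision = apply_relative_improvement(baseline_precision, 285.56)
print(f"{baseline_precision:.2f} -> {improved_precision:.4f}")
```

Under this reading, a 285.56% gain multiplies the baseline by about 3.86, so it implies a low starting score rather than a near-perfect final one.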

Software supply-chain security: Issues and countermeasures

Abstract

Software application development is a complex activity which involves various actors and organizations in what is called the software supply chain. The evolution of the software supply chain has led to numerous benefits such as profit maximization, code mutualization, and the optimization of lead times. However, the complexity of the software supply chain exposes it to multiple security issues and attacks, and compromises are highly prevalent. An attacker who compromises a single link in the software supply chain (e.g., by maliciously modifying the software) can harm users of this software, and this attack technique is frequently exploited to attack high-profile companies. We can provide a holistic and effective security solution to the software supply chain only if its security state and features are well understood. We discuss how we can achieve strong resilience of the software supply chain to cyberthreats. Next, we propose a holistic end-to-end security approach for the software supply chain.

Learning Sentinel-2 reflectance dynamics for data-driven assimilation and forecasting

By Anthony Frion, Lucas Drumetz, Guillaume Tochon, Mauro Dalla Mura, Abdeldjalil Aïssa El Bey

2023-05-29

In Proceedings of the 31st european signal processing conference (EUSIPCO)

Abstract

Over the last few years, massive amounts of satellite multispectral and hyperspectral images covering the Earth’s surface have been made publicly available for scientific purposes, for example through the European Copernicus project. Simultaneously, the development of self-supervised learning (SSL) methods has sparked great interest in the remote sensing community, making it possible to learn latent representations from unlabeled data to help with downstream tasks for which there are few annotated examples, such as interpolation, forecasting or unmixing. Following this line, we train a deep learning model inspired by Koopman operator theory to model long-term reflectance dynamics in an unsupervised way. We show that this trained model, being differentiable, can be used as a prior for data assimilation in a straightforward way. Our datasets, which are composed of Sentinel-2 multispectral image time series, are publicly released with several levels of processing.
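
For readers unfamiliar with the Koopman operator, its general idea can be written as below. The notation ($F$, $g$, $\varphi$, $\psi$, $\mathbf{K}$) is an assumption introduced here for illustration and is not taken from the paper, whose exact architecture and training loss may differ.

```latex
% Discrete-time dynamical system with state x_t
x_{t+1} = F(x_t)

% The Koopman operator acts on observables g and is linear even when F is not
(\mathcal{K} g)(x) = g\bigl(F(x)\bigr)

% Typical learned approximation: encoder \varphi, matrix \mathbf{K}, decoder \psi
x_{t+k} \approx \psi\bigl(\mathbf{K}^{k}\, \varphi(x_t)\bigr)
```

Because every step in the last expression is differentiable, such a model can in principle be plugged into a gradient-based assimilation procedure, which is consistent with the use as a prior described above.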

Languages of higher-dimensional timed automata

By Amazigh Amrane, Hugo Bazille, Emily Clement, Uli Fahrenberg

2023-05-22

In Proceedings of the 45th international conference on application and theory of petri nets and concurrency (PN’24)

Abstract

We present a new language semantics for real-time concurrency. Its operational models are higher-dimensional timed automata (HDTAs), a generalization of both higher-dimensional automata and timed automata. We define languages of HDTAs as sets of interval-timed pomsets with interfaces. As an application, we show that language inclusion of HDTAs is undecidable. On the other hand, using a region construction we can show that untimings of HDTA languages have enough regularity so that untimed language inclusion is decidable.

Forecasting electricity prices: An optimize then predict-based approach

By Léonard Tschora, Erwan Pierre, Marc Plantevit, Céline Robardet

2023-04-10

In Proceedings of the 21st international symposium on intelligent data analysis (IDA’23)

Abstract

We are interested in electricity price forecasting at the European scale. The electricity market is ruled by price regulation mechanisms that make it possible to adjust production to demand, as electricity is difficult to store. These mechanisms ensure the highest price for producers, the lowest price for consumers and a zero energy balance by setting day-ahead prices, i.e. prices for the next 24h. Most studies have focused on learning increasingly sophisticated models to predict the next day’s 24 hourly prices for a given zone. However, the zones are interdependent, and this interdependence has hitherto been largely underestimated. In the following, we show that estimating the cross-border energy transfer by solving an optimization problem and integrating it as an input to a model improves price forecasting performance for several zones together.
