Joseph Chazalon

ICDAR 2024 competition on historical map text detection, recognition, and linking

Abstract

Text on digitized historical maps contains valuable information, e.g., providing georeferenced political and cultural context. The goal of the ICDAR 2024 MapText Competition is to benchmark methods that automatically extract textual content on historical maps (e.g., place names) and connect words to form location phrases. The competition features two primary tasks, text detection and end-to-end text recognition, each with a secondary task of linking words into phrase blocks. Submissions are evaluated on two data sets: 1) the David Rumsey Historical Map Collection, which contains 936 map images covering 80 regions and 183 distinct publication years (from 1623 to 2012); 2) the French Land Registers (created during the 19th century), which contain 145 map images of 50 French cities and towns. The competition received 44 submissions across all tasks. This report presents the motivation for the competition, the tasks, the evaluation metrics, and the submission analysis.

Weakly supervised training for hologram verification in identity documents

By Glen Pouliquen, Guillaume Chiron, Joseph Chazalon, Thierry Géraud, Ahmad Montaser Awal

2024-04-25

In The 18th international conference on document analysis and recognition (ICDAR 2024)

Abstract

We propose a method to remotely verify the authenticity of Optically Variable Devices (OVDs), often referred to as “holograms”, in identity documents. Our method processes video clips captured with smartphones under common lighting conditions, and is evaluated on two public datasets: MIDV-HOLO and MIDV-2020. Thanks to weakly supervised training, we optimize a feature extraction and decision pipeline which achieves a new leading performance on MIDV-HOLO, while maintaining a high recall on documents from MIDV-2020 used as attack samples. It is also the first method, to date, to effectively address the photo replacement attack task, and can be trained on either genuine samples, attack samples, or both for increased performance. By enabling the verification of OVD shapes and dynamics with very little supervision, this work opens the way towards the use of massive amounts of unlabeled data to build robust remote identity document verification systems on commodity smartphones. Code is available at https://github.com/EPITAResearchLab/pouliquen.24.icdar.

Automatic vectorization of historical maps: A benchmark

Abstract

Shape vectorization is a key stage of the digitization of large-scale historical maps, especially city maps that exhibit complex and valuable details. Having access to digitized buildings, building blocks, street networks and other geographic content opens numerous new approaches for historical studies such as change tracking, morphological analysis and density estimations. In the context of the digitization of Paris atlases created in the 19th and early 20th centuries, we have designed a supervised pipeline that reliably extracts closed shapes from historical maps. This pipeline is based on a supervised edge filtering stage using deep filters, and a closed shape extraction stage using a watershed transform. It probably relies on multiple suboptimal methodological choices that hamper the vectorization performance in terms of accuracy and completeness. This paper comprehensively and objectively investigates which solutions are the most adequate among the numerous possibilities. The following contributions are subsequently introduced: (i) we propose an improved training protocol for map digitization; (ii) we introduce a joint optimization of the edge detection and shape extraction stages; (iii) we compare the performance of state-of-the-art deep edge filters with topology-preserving loss functions, including vision transformers; (iv) we evaluate the end-to-end deep learnable watershed against Meyer watershed. We subsequently design the critical path for a fully automatic extraction of key elements of historical maps. All the data, code, and benchmark results are freely available at https://github.com/soduco/Benchmark_historical_map_vectorization.

Création d’un graphe de connaissances géohistorique à partir d’annuaires du commerce parisien du 19ème siècle: Application aux métiers de la photographie

By Solenn Tual, Nathalie Abadie, Bertrand Duménieu, Joseph Chazalon, Edwin Carlinet

2023-07-01

In 34es journées francophones d’ingénierie des connaissances (IC 2023) @ plate-forme intelligence artificielle (PFIA 2023)

Abstract

Historical trade directories, published at a steady pace in many European cities throughout the 19th and 20th centuries, form a corpus of sources that is unique in its volume and in the opportunity it offers to follow urban transformations through the lens of the inhabitants' professional activities, from the scale of the individual up to that of the entire city. However, the spatio-temporal analysis of a given type of business through directory entries requires considerable manual work of inventorying, transcribing, and cross-referencing. To overcome this difficulty, this article proposes an automatic approach to build and visualize a geohistorical knowledge graph of the businesses listed in historical directories. The approach is tested on 19th-century Paris trade directories ranging from 1799 to 1908, using the photography trades as a case study.

A benchmark of nested named entity recognition approaches in historical structured documents

By Solenn Tual, Nathalie Abadie, Joseph Chazalon, Bertrand Duménieu, Edwin Carlinet

2023-06-01

In Proceedings of the international conference on document analysis and recognition (ICDAR 2023)

Abstract

Named Entity Recognition (NER) is a key step in the creation of structured data from digitised historical documents. Traditional NER approaches deal with flat named entities, whereas entities are often nested. For example, a postal address might contain a street name and a number. This work compares three nested NER approaches, including two state-of-the-art approaches using Transformer-based architectures. We introduce a new Transformer-based approach based on joint labelling and semantic weighting of errors, evaluated on a collection of 19th-century Paris trade directories. We evaluate the approaches with regard to the impact of supervised fine-tuning, unsupervised pre-training with noisy texts, and variation of IOB tagging formats. Our results show that while nested NER approaches enable extracting structured data directly, they do not benefit from the extra knowledge provided during training and reach a performance similar to the base approach on flat entities. Even though all three approaches perform well in terms of F1-scores, joint labelling is most suitable for hierarchically structured data. Finally, our experiments reveal the superiority of the IO tagging format on such data.
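
To make the tagging formats and the idea of joint labelling more concrete, here is a minimal sketch. It assumes that joint labelling concatenates the tags of each nesting level into a single label per token; the tokens, the entity labels (`PER`, `ACT`, `ADDR`, `STREET`, `NUM`), and the helper names are hypothetical illustrations, not taken from the paper's dataset or code.

```python
def tag_spans(tokens, spans, scheme="IOB"):
    """Tag tokens for ONE nesting level from (start, end_exclusive, label) spans."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        for i in range(start, end):
            if scheme == "IOB":
                # IOB marks the first token of an entity with "B-".
                prefix = "B" if i == start else "I"
            else:
                # IO has no begin marker: every entity token gets "I-".
                prefix = "I"
            tags[i] = f"{prefix}-{label}"
    return tags


def joint_labels(outer_tags, inner_tags):
    """Assumed joint labelling: one combined label per token across both levels."""
    return [f"{outer}+{inner}" for outer, inner in zip(outer_tags, inner_tags)]


# Hypothetical directory entry: a name, an activity, and a nested address.
tokens = ["Dupont", ",", "photographe", ",", "rue", "Vivienne", "12"]
outer = [(0, 1, "PER"), (2, 3, "ACT"), (4, 7, "ADDR")]
inner = [(4, 6, "STREET"), (6, 7, "NUM")]

iob_outer = tag_spans(tokens, outer, "IOB")
io_outer = tag_spans(tokens, outer, "IO")
joint = joint_labels(iob_outer, tag_spans(tokens, inner, "IOB"))

for token, a, b, c in zip(tokens, iob_outer, io_outer, joint):
    print(f"{token:12s} {a:8s} {b:8s} {c}")
```

With joint labels, a standard flat token classifier can predict both levels at once, at the cost of a larger label set.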

Linear object detection in document images using multiple object tracking

By Philippe Bernet, Joseph Chazalon, Edwin Carlinet, Alexandre Bourquelot, Élodie Puybareau

2023-06-01

In Proceedings of the international conference on document analysis and recognition (ICDAR 2023)

Abstract

Linear objects convey substantial information about document structure, but are challenging to detect accurately because of degradation (curved, erased) or decoration (doubled, dashed). Many approaches can recover some vector representation, but only one closed-source technique introduced in 1994, based on Kalman filters (a particular case of Multiple Object Tracking algorithm), can perform a pixel-accurate instance segmentation of linear objects and enable their selective removal from the original image. We aim at re-popularizing this approach and propose: 1. a framework for accurate instance segmentation of linear objects in document images using Multiple Object Tracking (MOT); 2. document image datasets and metrics which enable both vector- and pixel-based evaluation of linear object detection; 3. performance measures of MOT approaches against modern segment detectors; 4. performance measures of various tracking strategies, exhibiting alternatives to the original Kalman filter approach; and 5. an open-source implementation of a detector which can discriminate instances of curved, erased, dashed, intersecting and/or overlapping linear objects.
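
As a rough illustration of why a Kalman filter suits the tracking of degraded lines, the sketch below follows the vertical position of a single, roughly horizontal line while scanning an image column by column. It is a textbook scalar Kalman filter with a constant-position model, not the authors' implementation: the function name, the noise parameters, and the use of `None` for columns where the line is erased are all assumptions made for this example.

```python
def kalman_track(observations, process_var=1e-2, obs_var=1.0):
    """Scalar Kalman filter; `None` marks a column with no observation."""
    estimate = None   # current estimate of the line position
    variance = None   # uncertainty of that estimate
    track = []
    for z in observations:
        if estimate is None:
            if z is None:
                # Nothing to track yet.
                track.append(None)
                continue
            # Initialise the track from the first observation.
            estimate, variance = float(z), obs_var
        else:
            # Predict: position assumed constant, uncertainty grows.
            variance += process_var
            if z is not None:
                # Update: blend prediction and observation using the Kalman gain.
                gain = variance / (variance + obs_var)
                estimate += gain * (z - estimate)
                variance *= 1.0 - gain
        track.append(estimate)
    return track


# A line at y=10 with an erased section (two missing columns).
with_gap = kalman_track([10, 10, None, None, 10])
# A line at y=10 with one spurious observation (e.g. a crossing stroke).
with_outlier = kalman_track([10, 10, 10, 30, 10])
print(with_gap)
print(with_outlier)
```

The prediction step keeps the track alive across the gap, and the gain limits how far a single outlying observation can pull the estimate.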

A benchmark of named entity recognition approaches in historical documents

By Nathalie Abadie, Edwin Carlinet, Joseph Chazalon, Bertrand Duménieu

2022-04-07

In Proceedings of the 15th IAPR international workshop on document analysis systems

Abstract

Named entity recognition (NER) is a necessary step in many pipelines targeting historical documents. Indeed, such natural language processing techniques identify which class each text token belongs to, e.g. “person name”, “location”, “number”. Introducing a new public dataset built from 19th century French directories, we first assess how noisy modern, off-the-shelf OCR systems are. Then, we compare modern CNN- and Transformer-based NER techniques which can be reasonably used in the context of historical document analysis. We measure their requirements in terms of training data, the effects of OCR noise on their performance, and show how Transformer-based NER can benefit from unsupervised pre-training and supervised fine-tuning on noisy data. Results can be reproduced using resources available at https://github.com/soduco/paper-ner-bench-das22 and https://zenodo.org/record/6394464.

QU-BraTS: MICCAI BraTS 2020 challenge on quantifying uncertainty in brain tumor segmentation — Analysis of ranking scores and benchmarking results

By Raghav Mehta, Angelos Filos, Ujjwal Baid, Chiharu Sako, Richard McKinley, Michael Rebsamen, Katrin Dätwyler, Raphael Meier, Piotr Radojewski, Gowtham Krishnan Murugesan, Sahil Nalawade, Chandan Ganesh, Ben Wagner, Fang F. Yu, Baowei Fei, Ananth J. Madhuranthakam, Joseph A. Maldjian, Laura Daza, Catalina Gómez, Pablo Arbeláez, Chengliang Dai, Shuo Wang, Hadrien Reynaud, Yuanhan Mo, Elsa Angelini, Yike Guo, Wenjia Bai, Subhashis Banerjee, Linmin Pei, Murat AK, Sarahi Rosas-González, Ilyess Zemmoura, Clovis Tauber, Minh Hoang Vu, Tufve Nyholm, Tommy Löfstedt, Laura Mora Ballestar, Veronica Vilaplana, Hugh McHugh, Gonzalo Maso Talou, Alan Wang, Jay Patel, Ken Chang, Katharina Hoebel, Mishka Gidwani, Nishanth Arun, Sharut Gupta, Mehak Aggarwal, Praveer Singh, Elizabeth R. Gerstner, Jayashree Kalpathy-Cramer, Nicolas Boutry, Alexis Huard, Lasitha Vidyaratne, Md Monibor Rahman, Khan M. Iftekharuddin, Joseph Chazalon, Élodie Puybareau, Guillaume Tochon, Jun Ma, Mariano Cabezas, Xavier Llado, Arnau Oliver, Liliana Valencia, Sergi Valverde, Mehdi Amian, Mohammadreza Soltaninejad, Andriy Myronenko, Ali Hatamizadeh, Xue Feng, Quan Dou, Nicholas Tustison, Craig Meyer, Nisarg A. Shah, Sanjay Talbar, Marc-André Weber, Abhishek Mahajan, Andras Jakab, Roland Wiest, Hassan M. Fathallah-Shaykh, Arash Nazeri, Mikhail Milchenko, Daniel Marcus, Aikaterini Kotrotsou, Rivka Colen, John Freymann, Justin Kirby, Christos Davatzikos, Bjoern Menze, Spyridon Bakas, Yarin Gal, Tal Arbel

2022-01-09

In Journal of Machine Learning for Biomedical Imaging (MELBA)

Abstract

Deep learning (DL) models have provided state-of-the-art performance in various medical imaging benchmarking challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder translating DL models into clinical workflows. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way toward clinical translation. Several uncertainty estimation methods have recently been introduced for DL medical image segmentation tasks. Developing scores to evaluate and compare the performance of uncertainty measures will assist the end-user in making more informed decisions. In this study, we explore and evaluate a score developed during the BraTS 2019 and BraTS 2020 tasks on uncertainty quantification (QU-BraTS) and designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation. This score (1) rewards uncertainty estimates that produce high confidence in correct assertions and those that assign low confidence levels to incorrect assertions, and (2) penalizes uncertainty measures that lead to a higher percentage of under-confident correct assertions. We further benchmark the segmentation uncertainties generated by 14 independent participating teams of QU-BraTS 2020, all of which also participated in the main BraTS segmentation task. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, highlighting the need for uncertainty quantification in medical image analyses. Finally, in favor of transparency and reproducibility, our evaluation code is made publicly available at https://github.com/RagMeh11/QU-BraTS.

Introducing the boundary-aware loss for deep image segmentation

By Minh Ôn Vũ Ngọc, Yizi Chen, Nicolas Boutry, Joseph Chazalon, Edwin Carlinet, Jonathan Fabrizio, Clément Mallet, Thierry Géraud

2021-11-28

In Proceedings of the 32nd british machine vision conference (BMVC)

Abstract

Most contemporary supervised image segmentation methods do not preserve the initial topology of the given input (such as the closedness of the contours). One generally observes that edge points have been inserted or removed when the binary prediction and the ground truth are compared. This can be critical when accurate localization of multiple interconnected objects is required. In this paper, we present a new loss function, called Boundary-Aware loss (BALoss), based on the Minimum Barrier Distance (MBD) cut algorithm. It is able to locate what we call the leakage pixels and to encode the boundary information coming from the given ground truth. Thanks to this adapted loss, we are able to significantly refine the quality of the predicted boundaries during the learning procedure. Furthermore, our loss function is differentiable and can be applied to any kind of neural network used in image processing. We apply this loss function to the standard U-Net and DC U-Net on Electron Microscopy datasets, which are well known to be challenging because of their high noise level and the close or even connected objects covering the image space. Our segmentation performance, in terms of Variation of Information (VOI) and Adapted Rand Index (ARI), is very promising, with $\approx 15\%$ better VOI scores and $\approx 5\%$ better ARI scores than the state of the art. The code of the Boundary-Aware loss is freely available at https://github.com/onvungocminh/MBD_BAL.
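
For readers unfamiliar with the Variation of Information metric mentioned in this abstract, the following sketch computes it for two segmentations given as flat sequences of region labels, using the standard definition VI(A, B) = H(A|B) + H(B|A). This is a generic pure-Python formulation, not the evaluation code used in the paper; the function name is hypothetical and the natural logarithm is an assumption (other bases only rescale the value). Lower values indicate closer agreement.

```python
import math
from collections import Counter


def variation_of_information(seg_a, seg_b):
    """VI(A, B) = H(A|B) + H(B|A) for two equal-length label sequences."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must label the same pixels")
    n = len(seg_a)
    count_a = Counter(seg_a)               # region sizes in A
    count_b = Counter(seg_b)               # region sizes in B
    count_ab = Counter(zip(seg_a, seg_b))  # sizes of region overlaps
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Each overlap contributes to both conditional entropies.
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


truth = [0, 0, 0, 0]
relabelled = [7, 7, 7, 7]   # same partition, different label values
split = [0, 0, 1, 1]        # one region wrongly split in two

print(variation_of_information(truth, relabelled))
print(variation_of_information(truth, split))
```

The metric depends only on how pixels are grouped, not on the label values themselves, which is why relabelling a region leaves it unchanged while splitting a region increases it.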
