Edwin Carlinet

In PLOS ONE

Abstract

Shape vectorization is a key stage of the digitization of large-scale historical maps, especially city maps that exhibit complex and valuable details. Having access to digitized buildings, building blocks, street networks and other geographic content opens numerous new approaches for historical studies such as change tracking, morphological analysis and density estimation. In the context of the digitization of Paris atlases created in the 19th and early 20th centuries, we have designed a supervised pipeline that reliably extracts closed shapes from historical maps. This pipeline is based on a supervised edge filtering stage using deep filters, and a closed shape extraction stage using a watershed transform. It probably relies on multiple suboptimal methodological choices that hamper vectorization performance in terms of accuracy and completeness. This paper objectively and comprehensively investigates which solutions are the most adequate among the numerous possibilities. The following contributions are introduced: (i) we propose an improved training protocol for map digitization; (ii) we introduce a joint optimization of the edge detection and shape extraction stages; (iii) we compare the performance of state-of-the-art deep edge filters with topology-preserving loss functions, including vision transformers; (iv) we evaluate the end-to-end deep learnable watershed against the Meyer watershed. We subsequently design the critical path for a fully automatic extraction of key elements of historical maps. All the data, code, and benchmark results are freely available at https://github.com/soduco/Benchmark_historical_map_vectorization.
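The two-stage idea described in this abstract (filter the edges, then extract closed shapes) can be illustrated with a toy sketch. It assumes an edge-probability grid is already available, and it substitutes a plain threshold followed by a 4-connected flood fill for the deep edge filter and the watershed transform. The function name `extract_closed_shapes` and the 0.5 threshold are illustrative assumptions, not part of the published pipeline.

```python
from collections import deque


def extract_closed_shapes(edge_prob, threshold=0.5):
    """Label the regions enclosed by edges in a 2D grid of edge probabilities.

    Simplified stand-in for the watershed stage: pixels whose probability is
    >= threshold are treated as edges (label 0); the remaining pixels are
    grouped into 4-connected regions labeled 1, 2, ...
    """
    height, width = len(edge_prob), len(edge_prob[0])
    labels = [[0] * width for _ in range(height)]
    region_count = 0
    for start_y in range(height):
        for start_x in range(width):
            if edge_prob[start_y][start_x] >= threshold or labels[start_y][start_x]:
                continue  # edge pixel, or pixel already assigned to a region
            region_count += 1
            labels[start_y][start_x] = region_count
            queue = deque([(start_y, start_x)])
            while queue:  # breadth-first flood fill of the current region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx]
                            and edge_prob[ny][nx] < threshold):
                        labels[ny][nx] = region_count
                        queue.append((ny, nx))
    return labels, region_count


# Hypothetical 3x3 edge map: a vertical edge separates two regions.
edge_prob = [
    [0.1, 0.9, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.9, 0.1],
]
labels, region_count = extract_closed_shapes(edge_prob)
```

In this sketch a broken edge (a single low-probability pixel in the middle column) would merge the two regions, which gives an intuition for why the abstract stresses topology-preserving losses and a joint optimization of both stages.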

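Création d’un graphe de connaissances géohistorique à partir d’annuaires du commerce parisien du 19ème siècle: Application aux métiers de la photographie [Building a geohistorical knowledge graph from 19th-century Paris trade directories: Application to the photography trades]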

By Solenn Tual, Nathalie Abadie, Bertrand Duménieu, Joseph Chazalon, Edwin Carlinet

2023-07-01

In 34es journées francophones d’ingénierie des connaissances (IC 2023) @ plate-forme intelligence artificielle (PFIA 2023)

Abstract

Historical trade directories, published at a sustained pace in many European cities throughout the 19th and 20th centuries, form a corpus of sources that is unique in its volume and in the opportunity it offers to follow urban transformations through the lens of the inhabitants' professional activities, from the scale of the individual up to that of the whole city. The spatiotemporal analysis of a given type of business through directory entries nevertheless requires considerable manual work to inventory, transcribe, and cross-reference the entries. To overcome this difficulty, this article proposes an automatic approach for building and visualizing a geohistorical knowledge graph of the businesses listed in historical directories. The approach is tested on 19th-century Paris trade directories spanning 1799 to 1908, using the photography trades as a case study.

Structural analysis of the additive noise impact on the $\alpha$-tree

By Baptiste Esteban, Guillaume Tochon, Edwin Carlinet, Didier Verna

2023-06-30

In Proceedings of the 20th international conference on computer analysis of images and patterns (CAIP)

Abstract

Hierarchical representations are very convenient tools when working with images. Among them, the $\alpha$-tree is the basis of several powerful hierarchies used for various applications such as image simplification, object detection, or segmentation. However, it has been demonstrated that these tasks are very sensitive to the noise corrupting the image. While the quality of some $\alpha$-tree applications has been studied, including some with noisy images, the noise impact on the whole structure has been little investigated. Thus, in this paper, we examine the structure of $\alpha$-trees built on images corrupted by some noise with respect to the noise level. We compare its effects on constant and natural images, with different kinds of content, and we demonstrate the relation between the noise level and the distribution of every $\alpha$-tree node depth. Furthermore, we extend this study to the node persistence under a given energy criterion, and we propose a novel energy definition that allows assessing the robustness of a region to the noise.
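To make the structure under study concrete, here is a minimal sketch of the nested partitions that an $\alpha$-tree organizes, restricted to a 1D signal for brevity. It assumes the usual definition in which two neighboring samples belong to the same $\alpha$-connected component when their absolute difference is at most $\alpha$. The function names and the sample signal are illustrative assumptions and do not come from the paper.

```python
def alpha_partition(signal, alpha):
    """Split a non-empty 1D signal into alpha-connected components.

    Consecutive samples stay in the same component when their absolute
    difference is <= alpha. Each component is a list of sample indices.
    """
    components = [[0]]
    for i in range(1, len(signal)):
        if abs(signal[i] - signal[i - 1]) <= alpha:
            components[-1].append(i)
        else:
            components.append([i])
    return components


def alpha_levels(signal):
    """Map each alpha at which components can merge to its partition.

    Read from the smallest alpha to the largest, these nested partitions
    are the levels of the hierarchy, from the leaves up to the root.
    """
    differences = {abs(b - a) for a, b in zip(signal, signal[1:])}
    return {alpha: alpha_partition(signal, alpha)
            for alpha in sorted({0} | differences)}


# Hypothetical signal: two flat zones separated by a strong step.
signal = [10, 10, 12, 20, 21]
levels = alpha_levels(signal)
```

Here the two zones only merge at $\alpha = 8$. Adding noise to the samples changes the neighbor differences, and therefore the levels at which merges happen, which is the kind of structural effect the abstract describes.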

A benchmark of nested named entity recognition approaches in historical structured documents

By Solenn Tual, Nathalie Abadie, Joseph Chazalon, Bertrand Duménieu, Edwin Carlinet

2023-06-01

In Proceedings of the international conference on document analysis and recognition (ICDAR 2023)

Abstract

Named Entity Recognition (NER) is a key step in the creation of structured data from digitised historical documents. Traditional NER approaches deal with flat named entities, whereas entities are often nested. For example, a postal address might contain a street name and a number. This work compares three nested NER approaches, including two state-of-the-art approaches using Transformer-based architectures. We introduce a new Transformer-based approach based on joint labelling and semantic weighting of errors, evaluated on a collection of 19th-century Paris trade directories. We evaluate the approaches with respect to the impact of supervised fine-tuning, unsupervised pre-training with noisy texts, and variation of IOB tagging formats. Our results show that while nested NER approaches enable extracting structured data directly, they do not benefit from the extra knowledge provided during training and reach a performance similar to the base approach on flat entities. Even though all three approaches perform well in terms of F1-scores, joint labelling is most suitable for hierarchically structured data. Finally, our experiments reveal the superiority of the IO tagging format on such data.
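The tagging notions mentioned in this abstract can be sketched on the postal address example. The snippet below shows one plausible reading of joint labelling, in which the tags of two nesting levels are concatenated into a single tag per token, together with the conversion from IOB2 tags to the simpler IO format. The tag names (`ADDR`, `NUM`, `STREET`), the `+` separator, and the function names are hypothetical and are not taken from the paper.

```python
def joint_labels(outer_tags, inner_tags, separator="+"):
    """Combine two aligned tag sequences into one joint tag per token."""
    if len(outer_tags) != len(inner_tags):
        raise ValueError("tag sequences must be aligned token by token")
    return [f"{outer}{separator}{inner}"
            for outer, inner in zip(outer_tags, inner_tags)]


def iob_to_io(tags):
    """Convert IOB2 tags to IO tags by turning every B- prefix into I-."""
    return ["I-" + tag[2:] if tag.startswith("B-") else tag for tag in tags]


# Hypothetical nested annotation of a postal address.
tokens = ["12", "rue", "de", "Rivoli"]
outer = ["B-ADDR", "I-ADDR", "I-ADDR", "I-ADDR"]      # whole address
inner = ["B-NUM", "B-STREET", "I-STREET", "I-STREET"]  # nested entities

joint_iob = joint_labels(outer, inner)
joint_io = joint_labels(iob_to_io(outer), iob_to_io(inner))
```

In this example the IO variant needs two distinct joint tags where the IOB2 variant needs three. A smaller label set is one possible reason for the IO result reported in the abstract, although the abstract itself does not state the cause.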

Linear object detection in document images using multiple object tracking

By Philippe Bernet, Joseph Chazalon, Edwin Carlinet, Alexandre Bourquelot, Élodie Puybareau

2023-06-01

In Proceedings of the international conference on document analysis and recognition (ICDAR 2023)

Abstract

Linear objects convey substantial information about document structure, but are challenging to detect accurately because of degradation (curved, erased) or decoration (doubled, dashed). Many approaches can recover some vector representation, but only one closed-source technique introduced in 1994, based on Kalman filters (a particular case of Multiple Object Tracking algorithm), can perform a pixel-accurate instance segmentation of linear objects and enable their selective removal from the original image. We aim to re-popularize this approach and propose: 1. a framework for accurate instance segmentation of linear objects in document images using Multiple Object Tracking (MOT); 2. document image datasets and metrics which enable both vector- and pixel-based evaluation of linear object detection; 3. performance measures of MOT approaches against modern segment detectors; 4. performance measures of various tracking strategies, exhibiting alternatives to the original Kalman filters approach; and 5. an open-source implementation of a detector which can discriminate instances of curved, erased, dashed, intersecting and/or overlapping linear objects.

The Dahu graph-cut for interactive segmentation on 2D/3D images

By Minh Ôn Vũ Ngọc, Edwin Carlinet, Jonathan Fabrizio, Thierry Géraud

2022-12-03

In Pattern Recognition

Abstract

Interactive image segmentation is an important application in computer vision for selecting objects of interest in images. Several interactive segmentation methods are based on distance transform algorithms. However, the best-known distance transform, the geodesic distance, is sensitive to noise in the image and to seed placement. Recently, the Dahu pseudo-distance, a continuous version of the minimum barrier distance (MBD), has been shown to be more powerful than the geodesic distance in noisy and blurred images. This paper presents a method that combines the Dahu pseudo-distance with edge information in a graph-cut optimization framework, leveraging their complementary strengths. Our method works efficiently on both 2D/3D images and videos. Results show that our method achieves better performance than other distance-based and graph-cut methods, thereby reducing the user’s efforts.
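The noise sensitivity contrasted in this abstract can be illustrated on a 1D signal, where only one simple path joins two samples. The sketch assumes a geodesic cost defined as the sum of absolute intensity differences along the path (one common formulation, with the spatial term ignored) and a barrier cost defined as the maximum minus the minimum intensity along the path. It does not implement the Dahu pseudo-distance or the graph-cut framework of the paper, and the function names are illustrative.

```python
def geodesic_distance_1d(signal, i, j):
    """Sum of absolute intensity differences along the path from i to j."""
    low, high = sorted((i, j))
    return sum(abs(signal[k + 1] - signal[k]) for k in range(low, high))


def barrier_distance_1d(signal, i, j):
    """Barrier cost of the path from i to j: max minus min intensity."""
    low, high = sorted((i, j))
    segment = signal[low:high + 1]
    return max(segment) - min(segment)


# Hypothetical flat region corrupted by small oscillating noise.
noisy = [0, 1, 0, 1, 0, 1, 0]
geodesic_cost = geodesic_distance_1d(noisy, 0, 6)
barrier_cost = barrier_distance_1d(noisy, 0, 6)
```

The geodesic cost accumulates every small oscillation and grows with the path length, while the barrier cost stays bounded by the noise amplitude, which matches the robustness argument made in the abstract.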

A modern C++ point of view of programming in image processing

By Michaël Roynard, Edwin Carlinet, Thierry Géraud

2022-10-10

In Proceedings of the 21st international conference on generative programming: Concepts & experiences (GPCE 2022)

Abstract

C++ is a multi-paradigm language that enables the programmer to set up efficient image processing algorithms easily. The language's strength comes from many aspects. C++ is high-level, which enables developing powerful abstractions and mixing different programming styles to ease development. At the same time, C++ is low-level and can fully take advantage of the hardware to deliver the best performance. It is also very portable and highly compatible, which allows algorithms to be called from high-level, fast-prototyping languages such as Python or Matlab. One fundamental aspect where C++ shines is generic programming. Generic programming makes it possible to develop and reuse bricks of software on objects (images) of different natures (types) without performance loss. Nevertheless, reconciling genericity, efficiency, and simplicity at the same time is not trivial. Modern C++ (post-2011) has brought new features that made it simpler and more powerful. In this paper, we focus on some C++20 aspects of generic programming: ranges, views, and concepts, and see how they extend to images to ease the development of generic image algorithms while lowering the computation time.

The cost of dynamism in static languages for image processing

By Baptiste Esteban, Edwin Carlinet, Guillaume Tochon, Didier Verna

2022-10-10

In Proceedings of the 21st international conference on generative programming: Concepts & experiences (GPCE 2022)

Abstract

Generic programming is a powerful paradigm abstracting data structures and algorithms to improve their reusability, as long as they respect a given interface. Coupled with a performance-driven language, it is a paradigm of choice for scientific libraries where the implementation of manipulated objects may change depending on their use case, or for performance purposes. In those performance-driven languages, genericity is often implemented statically to perform some optimization. This does not fit well with the dynamism needed to handle objects which may only be known at runtime. Thus, in this article, we evaluate a model that couples static genericity with a dynamic model based on type erasure in the context of image processing. Its cost is assessed by comparing the performance of the implementation of some common image processing algorithms in C++ and Rust, two performance-driven languages supporting some form of genericity. Finally, we demonstrate that compile-time knowledge of some specific information is critical for performance, and also that the runtime overhead depends on the algorithmic scheme in use.

Estimation de la fonction de niveau de bruit pour des images couleurs en utilisant la morphologie mathématique [Noise level function estimation for color images using mathematical morphology]

By Baptiste Esteban, Guillaume Tochon, Edwin Carlinet, Didier Verna

2022-06-15

In 28e colloque sur le traitement du signal et des images

Abstract

The noise level is important information for some image processing applications such as segmentation or denoising. In previous work, we proposed a method that estimates this noise level by adapting to the content of a grayscale image, and we showed that its performance exceeds that of the state of the art. In this article, we propose an extension of this method to color images, whose multivariate values, lacking a natural order relation, raise new issues. To address them, we use two tools from mathematical morphology: the multivariate tree of shapes and complete lattice learning. Finally, we confirm the conclusions of our previous work for the estimation of the color noise level function, showing that adapting to the content of an image gives better performance than using square blocks.

Généricité dynamique pour des algorithmes morphologiques [Dynamic genericity for morphological algorithms]

By Baptiste Esteban, Edwin Carlinet, Guillaume Tochon, Didier Verna

2022-06-15

In 28e colloque sur le traitement du signal et des images

Abstract

Genericity is a powerful paradigm that makes it possible to implement a single algorithm and run it on different data types. As a result, it is widely used when developing scientific libraries, notably in image processing, where algorithms can apply to different types of images. C++ is a language of choice for this kind of library. It supports this paradigm, and its applications perform well owing to its compiled nature. Nevertheless, unlike dynamic languages such as Python or Julia, its capabilities for interactivity, which are useful during algorithm prototyping, are limited because of its static nature. In this article, we therefore review the different techniques that allow static and dynamic polymorphism to be used together, and we then evaluate the cost of turning static information into information known at run time. In particular, we show that some pieces of information about an image matter more than others for performance, and that the overhead also depends on the algorithm used.

Automatic vectorization of historical maps: A benchmark
