Yizi Chen

Automatic vectorization of historical maps: A benchmark

Abstract

Shape vectorization is a key stage of the digitization of large-scale historical maps, especially city maps that exhibit complex and valuable details. Having access to digitized buildings, building blocks, street networks and other geographic content opens numerous new approaches for historical studies such as change tracking, morphological analysis and density estimation. In the context of the digitization of Paris atlases created in the 19th and early 20th centuries, we have designed a supervised pipeline that reliably extracts closed shapes from historical maps. This pipeline is based on a supervised edge filtering stage using deep filters, and a closed shape extraction stage using a watershed transform. It probably relies on multiple suboptimal methodological choices that hamper vectorization performance in terms of accuracy and completeness. This paper objectively and comprehensively investigates which solutions are the most adequate among the numerous possibilities. We introduce the following contributions: (i) we propose an improved training protocol for map digitization; (ii) we introduce a joint optimization of the edge detection and shape extraction stages; (iii) we compare the performance of state-of-the-art deep edge filters, including vision transformers, with topology-preserving loss functions; (iv) we evaluate an end-to-end deep learnable watershed against the Meyer watershed. We subsequently design the critical path for a fully automatic extraction of key elements of historical maps. All the data, code, and benchmark results are freely available at https://github.com/soduco/Benchmark_historical_map_vectorization.
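
The closed shape extraction stage can be illustrated with a small, self-contained sketch. It is not the authors' implementation: it is a simplified Meyer-style flooding watershed in pure Python, run on a toy edge probability map standing in for the output of a deep edge filter. The function name `extract_closed_shapes`, the `seed_thresh` parameter and the toy data are assumptions made for illustration only.

```python
import heapq

WSHED = -1  # label given to watershed-line pixels (hypothetical convention)


def neighbors(y, x, h, w):
    """Yield the 4-connected neighbors of (y, x) inside an h x w grid."""
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w:
            yield ny, nx


def extract_closed_shapes(edge_prob, seed_thresh=0.2):
    """Label closed regions of a 2D edge probability map (list of lists)."""
    h, w = len(edge_prob), len(edge_prob[0])
    labels = [[0] * w for _ in range(h)]

    # 1. Seeds: 4-connected components of pixels with low edge probability.
    n_regions = 0
    for y in range(h):
        for x in range(w):
            if labels[y][x] or edge_prob[y][x] >= seed_thresh:
                continue
            n_regions += 1
            labels[y][x] = n_regions
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in neighbors(cy, cx, h, w):
                    if not labels[ny][nx] and edge_prob[ny][nx] < seed_thresh:
                        labels[ny][nx] = n_regions
                        stack.append((ny, nx))

    # 2. Flooding: visit unlabeled pixels by increasing edge probability.
    heap, queued, order = [], set(), 0

    def push_unlabeled_neighbors(y, x):
        nonlocal order
        for ny, nx in neighbors(y, x, h, w):
            if labels[ny][nx] == 0 and (ny, nx) not in queued:
                queued.add((ny, nx))
                heapq.heappush(heap, (edge_prob[ny][nx], order, ny, nx))
                order += 1

    for y in range(h):
        for x in range(w):
            if labels[y][x] > 0:
                push_unlabeled_neighbors(y, x)

    while heap:
        _, _, y, x = heapq.heappop(heap)
        touching = {labels[ny][nx] for ny, nx in neighbors(y, x, h, w)
                    if labels[ny][nx] > 0}
        if len(touching) == 1:
            # Only one region reaches this pixel: it joins that region.
            labels[y][x] = touching.pop()
            push_unlabeled_neighbors(y, x)
        else:
            # Two or more regions meet here: the pixel is a boundary.
            labels[y][x] = WSHED
    return labels, n_regions


# Toy map: a vertical edge in column 3 with one weak response at row 2.
edge_prob = [[0.0] * 7 for _ in range(5)]
for row in range(5):
    edge_prob[row][3] = 0.9
edge_prob[2][3] = 0.4  # weak edge response, still above seed_thresh
labels, n_regions = extract_closed_shapes(edge_prob)
```

In this toy case the weak edge pixel still ends up on the watershed line, so the two regions stay separated; this is the kind of closed-contour guarantee the watershed stage brings. A response that falls below `seed_thresh` would instead merge the two seeds, which is why the quality of the edge filter matters.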

Modern vectorization and alignment of historical maps: An application to Paris atlas (1789-1950)

Abstract

Maps have been a unique source of knowledge for centuries. Such historical documents provide invaluable information for analyzing complex spatial transformations over important time frames. This is particularly true for urban areas that encompass multiple interleaved research domains: humanities, social sciences, etc. The large amount and significant diversity of map sources call for automatic image processing techniques in order to extract the relevant objects as vector features. The complexity of maps (text, noise, digitization artifacts, etc.) has hindered the development of versatile and efficient raster-to-vector approaches for decades. In this thesis, we propose a learnable, reproducible, and reusable solution for the automatic transformation of raster maps into vector objects (building blocks, streets, rivers), focusing on the extraction of closed shapes. Our approach is built upon the complementary strengths of convolutional neural networks, which excel at filtering edges while presenting poor topological properties for their outputs, and mathematical morphology, which offers solid guarantees regarding closed shape extraction while being very sensitive to noise. In order to improve the robustness of deep edge filters to noise, we review several topology-preserving loss functions and propose new ones, which improve the topological properties of the results. We also introduce a new contrast convolution (CConv) layer to investigate how architectural changes can impact such properties. Finally, we investigate the different approaches which can be used to implement each stage, and how to combine them in the most efficient way.
Thanks to a shape extraction pipeline, we propose a new alignment procedure for historical map images, and start to leverage the redundancies contained in map sheets with similar contents to propagate annotations, improve vectorization quality, and eventually detect evolution patterns for later analysis or to automatically assess vectorization quality. To evaluate the performance of all methods mentioned above, we released a new dataset of annotated historical map images. It is the first public and open dataset targeting the task of historical map vectorization. We hope that thanks to our publications and the public and open releases of datasets, code and results, our work will benefit a wide range of historical map-related applications.

Introducing the boundary-aware loss for deep image segmentation

By Minh Ôn Vũ Ngọc, Yizi Chen, Nicolas Boutry, Joseph Chazalon, Edwin Carlinet, Jonathan Fabrizio, Clément Mallet, Thierry Géraud

2021-11-28

In Proceedings of the 32nd British machine vision conference (BMVC)

Abstract

Most contemporary supervised image segmentation methods do not preserve the initial topology of the given input (such as the closedness of the contours). Comparing the binary prediction with the ground truth generally shows that edge points have been inserted or removed. This can be critical when accurate localization of multiple interconnected objects is required. In this paper, we present a new loss function, called Boundary-Aware loss (BALoss), based on the Minimum Barrier Distance (MBD) cut algorithm. It is able to locate what we call the leakage pixels and to encode the boundary information coming from the given ground truth. Thanks to this adapted loss, we are able to significantly refine the quality of the predicted boundaries during the learning procedure. Furthermore, our loss function is differentiable and can be applied to any kind of neural network used in image processing. We apply this loss function to the standard U-Net and DC U-Net on Electron Microscopy datasets. These datasets are well known to be challenging due to their high noise level and to the close or even connected objects covering the image space. Our segmentation performance, in terms of Variation of Information (VOI) and Adjusted Rand Index (ARI), is very promising and leads to $\approx 15\%$ better VOI scores and $\approx 5\%$ better ARI scores than the state of the art. The code of the Boundary-Aware loss is freely available at https://github.com/onvungocminh/MBD_BAL
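
For readers unfamiliar with the two evaluation scores, the following minimal sketch shows how VOI and ARI (computed here as the adjusted Rand index) can be obtained from two flat label sequences. It uses the textbook definitions with natural logarithms; the function names and toy labelings are illustrative assumptions, and the paper's evaluation code may differ in details such as the logarithm base or the handling of background pixels.

```python
from collections import Counter
from math import comb, log


def variation_of_information(a, b):
    """VOI = H(A|B) + H(B|A) for two labelings of the same pixels (lower is better)."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    voi = 0.0
    for (i, j), n_ij in Counter(zip(a, b)).items():
        # Each joint cell contributes -p_ij * (log(p_ij / p_i) + log(p_ij / p_j)).
        voi -= (n_ij / n) * (log(n_ij / count_a[i]) + log(n_ij / count_b[j]))
    return voi


def adjusted_rand_index(a, b):
    """Pair-counting agreement corrected for chance (1.0 means identical partitions)."""
    n = len(a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster on both sides
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


# Same partition under different label names: perfect scores.
same_voi = variation_of_information([0, 0, 1, 1], [5, 5, 7, 7])
same_ari = adjusted_rand_index([0, 0, 1, 1], [5, 5, 7, 7])

# Two partitions that cut the pixels in unrelated ways.
cross_voi = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
cross_ari = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Both scores depend only on how pixels are grouped, not on the label values themselves, which is why they suit segmentation tasks where objects touch or merge.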

ICDAR 2021 competition on historical map segmentation

By Joseph Chazalon, Edwin Carlinet, Yizi Chen, Julien Perret, Bertrand Duménieu, Clément Mallet, Thierry Géraud, Vincent Nguyen, Nam Nguyen, Josef Baloun, Ladislav Lenc, Pavel Král

2021-05-17

In Proceedings of the 16th international conference on document analysis and recognition (ICDAR’21)

Abstract

This paper presents the final results of the ICDAR 2021 Competition on Historical Map Segmentation (MapSeg), encouraging research on a series of historical atlases of Paris, France, drawn at 1/5000 scale between 1894 and 1937. The competition featured three tasks, awarded separately. Task 1 consists in detecting building blocks and was won by the L3IRIS team using a DenseNet-121 network trained in a weakly supervised fashion. This task is evaluated on 3 large images containing hundreds of shapes to detect. Task 2 consists in segmenting map content from the larger map sheet, and was won by the UWB team using a U-Net-like FCN combined with a binarization method to increase edge detection accuracy. Task 3 consists in locating intersection points of geo-referencing lines, and was also won by the UWB team, who used a dedicated pipeline combining binarization, line detection with Hough transform, candidate filtering, and template matching for intersection refinement. Tasks 2 and 3 are evaluated on 95 map sheets with complex content. Dataset, evaluation tools and results are available under permissive licensing at https://icdar21-mapseg.github.io/.

Vectorization of historical maps using deep edge filtering and closed shape extraction

By Yizi Chen, Edwin Carlinet, Joseph Chazalon, Clément Mallet, Bertrand Duménieu, Julien Perret

2021-05-17

In Proceedings of the 16th international conference on document analysis and recognition (ICDAR’21)

Abstract

Maps have been a unique source of knowledge for centuries. Such historical documents provide invaluable information for analyzing the complex spatial transformation of landscapes over important time frames. This is particularly true for urban areas that encompass multiple interleaved research domains (social sciences, economy, etc.). The large amount and significant diversity of map sources call for automatic image processing techniques in order to extract the relevant objects as vector shapes. The complexity of maps (text, noise, digitization artifacts, etc.) has hindered the development of versatile and efficient raster-to-vector approaches for decades. We propose a learnable, reproducible, and reusable solution for the automatic transformation of raster maps into vector objects (building blocks, streets, rivers). It is built upon the complementary strengths of mathematical morphology and convolutional neural networks through efficient edge filtering. Furthermore, we modify ConnNet and combine it with a deep edge filtering architecture to make use of pixel connectivity information and build an end-to-end system that does not require any post-processing. In this paper, we focus on a comprehensive benchmark of various architectures on multiple datasets, coupled with a novel vectorization step. Our experimental results on a new public dataset using the COCO Panoptic metric are very encouraging and are confirmed by a qualitative analysis of the success and failure cases of our approach. Code, dataset, results and extra illustrations are freely available at https://github.com/soduco/ICDAR-2021-Vectorization.
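
As a rough illustration of the COCO Panoptic metric mentioned above, the sketch below computes a simplified, single-class panoptic quality (PQ): predicted and ground-truth shapes are matched when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + FP/2 + FN/2. The function name, the flat-list input format and the use of label 0 as background are assumptions for illustration; the official COCO tooling also handles void regions and multiple classes, which this sketch does not.

```python
from collections import Counter


def panoptic_quality(gt, pred, background=0):
    """Simplified single-class PQ between two flat label maps of equal length."""
    gt_area = Counter(g for g in gt if g != background)
    pred_area = Counter(p for p in pred if p != background)
    overlap = Counter((g, p) for g, p in zip(gt, pred)
                      if g != background and p != background)

    iou_sum, true_pos = 0.0, 0
    for (g, p), inter in overlap.items():
        iou = inter / (gt_area[g] + pred_area[p] - inter)
        # IoU > 0.5 guarantees that each shape has at most one match.
        if iou > 0.5:
            iou_sum += iou
            true_pos += 1

    false_pos = len(pred_area) - true_pos  # predicted shapes left unmatched
    false_neg = len(gt_area) - true_pos    # ground-truth shapes left unmatched
    denom = true_pos + 0.5 * false_pos + 0.5 * false_neg
    return iou_sum / denom if denom else 0.0


# Two shapes, with one boundary pixel assigned to the wrong side.
pq_shifted = panoptic_quality([1, 1, 1, 1, 2, 2, 2, 2],
                              [1, 1, 1, 2, 2, 2, 2, 2])

# One perfectly recovered shape plus one spurious predicted shape.
pq_spurious = panoptic_quality([1, 1, 1, 1, 0, 0],
                               [1, 1, 1, 1, 3, 3])
```

Because PQ works on matched shapes rather than on pixels, a missing or leaking boundary that merges two building blocks is penalized as unmatched shapes, which makes it a natural fit for evaluating closed shape extraction.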

Combining deep learning and mathematical morphology for historical map segmentation

By Yizi Chen, Edwin Carlinet, Joseph Chazalon, Clément Mallet, Bertrand Duménieu, Julien Perret

2021-02-16

In Proceedings of the IAPR international conference on discrete geometry and mathematical morphology (DGMM)

Abstract

The digitization of historical maps enables the study of ancient, fragile, unique, and hardly accessible information sources. Main map features can be retrieved and tracked through time for subsequent thematic analysis. This work focuses on the vectorization step, i.e., the extraction of vector shapes of the objects of interest from raster images of maps. We are particularly interested in the detection of closed shapes such as buildings, building blocks, gardens, rivers, etc. in order to monitor their temporal evolution. Historical map images present significant pattern recognition challenges. The extraction of closed shapes by using traditional Mathematical Morphology (MM) is highly challenging due to the overlapping of multiple map features and texts. Moreover, state-of-the-art Convolutional Neural Networks (CNN) are well suited to image content filtering but provide no guarantee about closed shape detection. Also, the lack of textural and color information in historical maps makes it hard for CNN to detect shapes that are represented only by their boundaries. Our contribution is a pipeline that combines the strengths of CNN (efficient edge detection and filtering) and MM (guaranteed extraction of closed shapes) in order to achieve such a task. The evaluation of our approach on a public dataset shows its effectiveness for extracting the closed boundaries of objects in historical maps.
