Bertrand Duménieu

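Creating a geohistorical knowledge graph from 19th-century Paris trade directories: application to the photography trades (Création d’un graphe de connaissances géohistorique à partir d’annuaires du commerce parisien du 19e siècle : application aux métiers de la photographie)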

By Solenn Tual, Nathalie Abadie, Bertrand Duménieu, Joseph Chazalon, Edwin Carlinet

2023-07-01

In 34es journées francophones d’ingénierie des connaissances (IC 2023) @ plate-forme intelligence artificielle (PFIA 2023)

Abstract Historical professional directories, published at a steady pace in many European cities throughout the 19th and 20th centuries, form a corpus of sources that is unique in its volume and in the possibility it offers to follow urban transformations through the lens of the inhabitants' professional activities, from the scale of the individual up to that of the entire city. The spatio-temporal analysis of a given type of business through directory entries nevertheless requires considerable manual work of inventorying, transcribing, and cross-referencing.

A benchmark of nested named entity recognition approaches in historical structured documents

By Solenn Tual, Nathalie Abadie, Joseph Chazalon, Bertrand Duménieu, Edwin Carlinet

2023-06-01

In Proceedings of the international conference on document analysis and recognition (ICDAR 2023)

Abstract Named Entity Recognition (NER) is a key step in the creation of structured data from digitised historical documents. Traditional NER approaches deal with flat named entities, whereas entities are often nested. For example, a postal address might contain a street name and a number. This work compares three nested NER approaches, including two state-of-the-art approaches using Transformer-based architectures. We introduce a new Transformer-based approach based on joint labelling and semantic weighting of errors, evaluated on a collection of 19th-century Paris trade directories.

A benchmark of named entity recognition approaches in historical documents

By Nathalie Abadie, Edwin Carlinet, Joseph Chazalon, Bertrand Duménieu

2022-04-07

In Proceedings of the 15th IAPR international workshop on document analysis systems

Abstract Named entity recognition (NER) is a necessary step in many pipelines targeting historical documents. Indeed, such natural language processing techniques identify which class each text token belongs to, e.g. “person name”, “location”, “number”. Introducing a new public dataset built from 19th-century French directories, we first assess how noisy the output of modern, off-the-shelf OCR systems is. Then, we compare modern CNN- and Transformer-based NER techniques which can reasonably be used in the context of historical document analysis.

ICDAR 2021 competition on historical map segmentation

By Joseph Chazalon, Edwin Carlinet, Yizi Chen, Julien Perret, Bertrand Duménieu, Clément Mallet, Thierry Géraud, Vincent Nguyen, Nam Nguyen, Josef Baloun, Ladislav Lenc, Pavel Král

2021-05-17

In Proceedings of the 16th international conference on document analysis and recognition (ICDAR’21)

Abstract This paper presents the final results of the ICDAR 2021 Competition on Historical Map Segmentation (MapSeg), encouraging research on a series of historical atlases of Paris, France, drawn at 1/5000 scale between 1894 and 1937. The competition featured three tasks, awarded separately. Task 1 consists in detecting building blocks and was won by the L3IRIS team using a DenseNet-121 network trained in a weakly supervised fashion. This task is evaluated on 3 large images containing hundreds of shapes to detect.

Vectorization of historical maps using deep edge filtering and closed shape extraction

By Yizi Chen, Edwin Carlinet, Joseph Chazalon, Clément Mallet, Bertrand Duménieu, Julien Perret

2021-05-17

In Proceedings of the 16th international conference on document analysis and recognition (ICDAR’21)

Abstract Maps have been a unique source of knowledge for centuries. Such historical documents provide invaluable information for analyzing the complex spatial transformation of landscapes over important time frames. This is particularly true for urban areas that encompass multiple interleaved research domains (social sciences, economy, etc.). The large amount and significant diversity of map sources call for automatic image processing techniques in order to extract the relevant objects under a vectorial shape.

Combining deep learning and mathematical morphology for historical map segmentation

By Yizi Chen, Edwin Carlinet, Joseph Chazalon, Clément Mallet, Bertrand Duménieu, Julien Perret

2021-02-16

In Proceedings of the IAPR international conference on discrete geometry and mathematical morphology (DGMM)

Abstract The digitization of historical maps enables the study of ancient, fragile, unique, and hard-to-access information sources. Main map features can be retrieved and tracked through time for subsequent thematic analysis. The goal of this work is the vectorization step, i.e., the extraction of vector shapes of the objects of interest from raster images of maps. We are particularly interested in the detection of closed shapes such as buildings, building blocks, gardens, rivers, etc.
