Guillaume Lazzara

Planting, growing and pruning trees: Connected filters applied to document image analysis

By Guillaume Lazzara, Thierry Géraud, Roland Levillain

2013-12-10

In Proceedings of the 11th IAPR international workshop on document analysis systems (DAS)

Abstract

Mathematical morphology, when used in the field of document image analysis and processing, is often limited to some classical yet basic tools. The domain however features a lesser-known class of powerful operators, called connected filters. These operators present an important property: they do not shift nor create contours. Most connected filters are linked to a tree-based representation of an image’s contents, where nodes represent connected components while edges express an inclusion relation. By computing attributes for each node of the tree from the corresponding connected component, then selecting nodes according to an attribute-based criterion, one can either filter or recognize objects in an image. This strategy is very intuitive, efficient, easy to implement, and actually well-suited to processing images of magazines. Examples of applications include image simplification, smart binarization, and object identification.

Continue reading

Efficient multiscale Sauvola’s binarization

By Guillaume Lazzara, Thierry Géraud

2013-04-25

In International Journal of Document Analysis and Recognition (IJDAR)

Abstract

This work is focused on the most commonly used binarization method: Sauvola’s. It performs relatively well on classical documents. However, three main defects remain: the window parameter of Sauvola’s formula do not fit automatically to the content; it is not robust to low contrasts; it is non-invariant with respect to contrast inversion. Thus, on documents such as magazines, the content may not be retrieved correctly which is crucial for indexing purpose. In this paper, we describe how to implement an efficient multiscale implementation of Sauvola’s algorithm in order to guarantee good binarization for both small and large objects inside a single document without adjusting the window size to the content. We also describe on how to implement it in an efficient way, step by step. Pixel-based accuracy and OCR evaluations are performed on more than 120 documents. This implementation remains notably fast compared to the original algorithm. For fixed parameters, text recognition rates and binarization quality are equal or better than other methods on small and medium text and is significantly improved on large text. Thanks to the way it is implemented, it is also more robust on textured text and on image binarization. This implementation extends the robustness of Sauvola’s algorithm by making the results almost insensible to the window size whatever the object sizes. Its properties make it usable in full document analysis toolchains.

Continue reading

The SCRIBO module of the Olena platform: A free software framework for document image analysis

By Guillaume Lazzara, Roland Levillain, Thierry Géraud, Yann Jacquelet, Julien Marquegnies, Arthur Crépin-Leblond

2011-06-01

In Proceedings of the 11th international conference on document analysis and recognition (ICDAR)

Abstract

Electronic documents are being more and more usable thanks to better and more affordable network, storage and computational facilities. But in order to benefit from computer-aided document management, paper documents must be digitized and analyzed. This task may be challenging at several levels. Data may be of multiple types thus requiring different adapted processing chains. The tools to be developed should also take into account the needs and knowledge of users, ranging from a simple graphical application to a complete programming framework. Finally, the data sets to process may be large. In this paper, we expose a set of features that a Document Image Analysis framework should provide to handle the previous issues. In particular, a good strategy to address both flexibility and efficiency issues is the Generic Programming (GP) paradigm. These ideas are implemented as an open source module, SCRIBO, built on top of Olena, a generic and efficient image processing platform. Our solution features services such as preprocessing filters, text detection, page segmentation and document reconstruction (as XML, PDF or HTML documents). This framework, composed of reusable software components, can be used to create full-fledged graphical applications, small utilities, or processing chains to be integrated into third-party projects.

Continue reading