Publications

Derived-term automata of multitape rational expressions

By Akim Demaille

2016-04-26

In Proceedings of implementation and application of automata, 21st international conference (CIAA’16)

Abstract

We introduce (weighted) rational expressions to denote series over Cartesian products of monoids. To this end, we propose the operator $\mid$ to build multitape expressions such as $(a^+\mid x + b^+\mid y)^*$. We define expansions, which generalize the concept of derivative of a rational expression but do not require a free monoid. We propose an algorithm based on expansions to build multitape automata from multitape expressions.


Region-based classification of remote sensing images with the morphological tree of shapes

By Gabriele Cavallaro, Mauro Dalla Mura, Edwin Carlinet, Thierry Géraud, Nicola Falco, Jón Atli Benediktsson

2016-04-12

In Proceedings of the IEEE international geoscience and remote sensing symposium (IGARSS)

Abstract

Satellite image classification is a key task used in remote sensing for the automatic interpretation of a large amount of information. Many classification algorithms now use advanced image processing methods to improve classification accuracy. One of the best state-of-the-art methods, which significantly improves the classification of complex scenes, relies on Self-Dual Attribute Profiles (SDAPs). In this approach, the underlying representation of an image is the Tree of Shapes, which encodes the inclusion of connected components of the image. The SDAP computes for each pixel a vector of attributes providing a local multiscale representation of the information, hence leading to a fine description of the local structures of the image. Instead of performing a pixel-wise classification on features extracted from the Tree of Shapes, it is proposed to directly classify its nodes. Extending a specific interactive segmentation algorithm enables it to deal with the multi-class classification problem. The method does not involve any statistical learning and is based entirely on morphological information related to the tree. Consequently, a very simple and effective region-based classifier relying on basic attributes is presented.


Hierarchical segmentation using tree-based shape spaces

By Yongchao Xu, Edwin Carlinet, Thierry Géraud, Laurent Najman

2016-04-11

In IEEE Transactions on Pattern Analysis and Machine Intelligence

Abstract

Current trends in image segmentation are to compute a hierarchy of image segmentations from fine to coarse. A classical approach to obtain a single meaningful image partition from a given hierarchy is to cut it in an optimal way, following the seminal approach of the scale-set theory. While interesting in many cases, the resulting segmentation, being a non-horizontal cut, is limited by the structure of the hierarchy. In this paper, we propose a novel approach that acts by transforming an input hierarchy into a new saliency map. It relies on the notion of shape space: a graph representation of a set of regions extracted from the image. Each region is characterized with an attribute describing it. We weigh the boundaries of a subset of meaningful regions (local minima) in the shape space by extinction values based on the attribute. This extinction-based saliency map represents a new hierarchy of segmentations highlighting regions having some specific characteristics. Each threshold of this map represents a segmentation which is generally different from any cut of the original hierarchy. This new approach thus enlarges the set of possible partition results that can be extracted from a given hierarchy. Qualitative and quantitative illustrations demonstrate the usefulness of the proposed method.


TextCatcher: A method to detect curved and challenging text in natural scenes

By Jonathan Fabrizio, Myriam Robert-Seidowsky, Séverine Dubuisson, Stefania Calarasanu, Raphaël Boissel

2016-04-08

In International Journal on Document Analysis and Recognition

Abstract

In this paper, we propose a hybrid, multi-scale text detection algorithm. First, it relies on a connected component-based approach: after the segmentation of the image, a classification step using a new wavelet descriptor spots the letters. A new graph model and its traversal procedure then form candidate text areas. Second, a texture-based approach discards the false positives. Finally, the detected text areas are precisely cut out and a new binarization step is introduced. The main advantage of our method is that few assumptions are put forward. Thus, “challenging texts” such as multi-sized, multi-colored, multi-oriented or curved text can be localized. The efficiency of TextCatcher has been validated on three different datasets: two come from the ICDAR competition, and the third contains photographs we have taken of various everyday texts. We present both qualitative and quantitative results.


Type-checking of heterogeneous sequences in Common Lisp

By Jim Newton, Akim Demaille, Didier Verna

2016-03-25

In ELS 2016, the 9th european lisp symposium

Abstract

We introduce the abstract concept of rational type expression and show its relationship to rational language theory. We further present a concrete syntax, regular type expression, and a Common Lisp implementation thereof which allows the programmer to declaratively express the types of heterogeneous sequences in a way which is natural in the Common Lisp language. The implementation uses techniques well known and well founded in rational language theory, in particular the use of the Brzozowski derivative and deterministic automata to reach a solution which can match a sequence in linear time. We illustrate the concept with several motivating examples, and finally explain many details of its implementation.
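As a rough illustration of the derivative-based matching mentioned in this abstract, the sketch below applies Brzozowski derivatives to a sequence of Python values. It is not the paper's Common Lisp implementation: the expression classes (Empty, Eps, Of, Cat, Alt, Star) and the functions nullable, derive and matches are hypothetical names chosen for this example. It also derives on the fly without simplifying expressions or building a deterministic automaton, so it does not offer the linear-time guarantee described above.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical mini-AST for rational type expressions (illustrative names only).

@dataclass(frozen=True)
class Empty:
    """Matches no sequence at all."""

@dataclass(frozen=True)
class Eps:
    """Matches only the empty sequence."""

@dataclass(frozen=True)
class Of:
    """Matches a one-element sequence whose element has the given type."""
    type_: type

@dataclass(frozen=True)
class Cat:
    left: Any
    right: Any

@dataclass(frozen=True)
class Alt:
    left: Any
    right: Any

@dataclass(frozen=True)
class Star:
    body: Any

def nullable(e):
    """True if the expression matches the empty sequence."""
    if isinstance(e, (Eps, Star)):
        return True
    if isinstance(e, (Empty, Of)):
        return False
    if isinstance(e, Cat):
        return nullable(e.left) and nullable(e.right)
    return nullable(e.left) or nullable(e.right)  # Alt

def derive(e, x):
    """Brzozowski derivative of e with respect to the element x."""
    if isinstance(e, (Empty, Eps)):
        return Empty()
    if isinstance(e, Of):
        return Eps() if isinstance(x, e.type_) else Empty()
    if isinstance(e, Alt):
        return Alt(derive(e.left, x), derive(e.right, x))
    if isinstance(e, Star):
        return Cat(derive(e.body, x), e)
    # Cat: derive the left part; also the right part if the left can be empty.
    head = Cat(derive(e.left, x), e.right)
    return Alt(head, derive(e.right, x)) if nullable(e.left) else head

def matches(e, seq):
    """Derive by each element in turn, then test for the empty sequence."""
    for x in seq:
        e = derive(e, x)
    return nullable(e)

# Zero or more (string, integer) pairs, a typical heterogeneous sequence.
pairs = Star(Cat(Of(str), Of(int)))
print(matches(pairs, ["a", 1, "b", 2]))
print(matches(pairs, ["a", 1, "b"]))
```

Because each derivative is kept unsimplified, the expression can grow with the length of the input; compiling the derivatives into a deterministic automaton ahead of time, as the abstract describes, avoids that cost.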


Efficient dynamic type checking of heterogeneous sequences

Abstract

This report provides detailed background on our development of the rational type expression, its concrete syntax (the regular type expression), and a Common Lisp implementation which allows the programmer to declaratively express the types of heterogeneous sequences in a way which is natural in the Common Lisp language. We present a brief theoretical background in rational language theory, which facilitates the development of rational type expressions, in particular the use of the Brzozowski derivative and deterministic automata to arrive at a solution which can match a sequence in linear time. We illustrate the concept with several motivating examples, and finally explain many details of its implementation.


Towards the rectification of highly distorted texts

By Stefania Calarasanu, Séverine Dubuisson, Jonathan Fabrizio

2016-02-01

In Proceedings of the 11th international conference on computer vision theory and applications (VISAPP)

Abstract

A frequent challenge for many Text Understanding Systems is to tackle the variety of text characteristics in born-digital and natural scene images to which current OCRs are not well adapted. For example, texts in perspective are frequently present in real-world images, but despite the ability of some detectors to accurately localize such text objects, the recognition stage fails most of the time. Indeed, most OCRs are not designed to handle text strings in perspective but rather expect horizontal texts in a parallel-frontal plane to provide a correct transcription. In this paper, we propose a rectification procedure that can correct highly distorted texts, subject to rotation, shearing and perspective deformations. The method is based on an accurate estimation of the quadrangle bounding the deformed text in order to compute a homography to transform this quadrangle (and its content) into a horizontal rectangle. The rectification is validated on the dataset proposed during the ICDAR 2015 Competition on Scene Text Rectification.
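The last step described in this abstract, mapping a quadrangle onto a horizontal rectangle with a homography, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: it takes the four corners as already estimated (the estimation itself is the paper's contribution and is not shown), assumes they are listed in the same order as the rectangle corners, and solves the standard eight-unknown linear system with the last matrix entry fixed to 1. The function names solve_linear, homography and apply_h, and the sample coordinates, are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography(src, dst):
    """Return the 3x3 matrix sending each src point to its dst point."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_h(hm, point):
    """Apply the homography to one (x, y) point."""
    x, y = point
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)

# Made-up quadrangle around a distorted text and a 100 x 30 target rectangle.
quad = [(10, 20), (110, 5), (120, 60), (5, 50)]
rect = [(0, 0), (100, 0), (100, 30), (0, 30)]
H = homography(quad, rect)
print([apply_h(H, p) for p in quad])
```

Rectifying the image content would additionally require resampling every pixel of the target rectangle through the inverse mapping, which is omitted here.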


What is a good evaluation protocol for text localization systems? Concerns, arguments, comparisons and solutions

Abstract

A trustworthy protocol is essential to evaluate a text detection algorithm in order, first, to measure its efficiency and adjust its parameters and, second, to compare its performance with that of other algorithms. However, current protocols do not give precise enough evaluations because they use coarse evaluation metrics and deal with inconsistent matchings between the output of detection algorithms and the ground truth, both often limited to rectangular shapes. In this paper, we propose a new evaluation protocol, named EvaLTex, that solves some of the current problems associated with classical metrics and matching strategies. Our system deals with different kinds of annotations and detection shapes. It also considers different kinds of granularity between detections and ground truth objects and hence provides more realistic and accurate evaluation measures. We use this protocol to evaluate text detection algorithms and highlight some key examples that show that the provided scores are more relevant than those of currently used evaluation protocols.


Improvement of a text detection chain and the proposition of a new evaluation protocol for text detection algorithms

Abstract

The objective of this thesis is twofold. On the one hand, it proposes a more accurate evaluation protocol for text detection systems that solves some of the existing problems in this area. On the other hand, it focuses on the design of a text rectification procedure used for the correction of highly deformed texts. Text detection systems have gained significant importance in recent years. The growing number of approaches proposed in the literature requires a rigorous performance evaluation and ranking. In the context of text detection, an evaluation protocol relies on three elements: a reliable text reference, a set of matching rules deciding the relationship between the ground truth and the detections, and finally a set of metrics that produce intuitive scores. The few existing evaluation protocols often lack accuracy, either due to inconsistent matching procedures that provide unfair scores or due to unrepresentative metrics. Despite these issues, researchers still use these protocols to evaluate their work. In this Ph.D. thesis we propose a new evaluation protocol for text detection algorithms that tackles most of the drawbacks faced by currently used evaluation methods. This work is focused on three main contributions: firstly, we introduce a complex text reference representation that does not constrain text detectors to adopt a specific detection granularity level or annotation representation; secondly, we propose a set of matching rules capable of evaluating any type of scenario that can occur between a text reference and a detection; and finally we show how we can analyze a set of detection results, not only through a set of metrics, but also through an intuitive visual representation. We use this protocol to evaluate different text detectors and then compare the results with those provided by alternative evaluation methods.
A frequent challenge for many Text Understanding Systems is to tackle the variety of text characteristics in born-digital and natural scene images to which current OCRs are not well adapted. For example, texts in perspective are frequently present in real-world images because the camera capture angle is not normal to the plane containing text regions. Despite the ability of some detectors to accurately localize such text objects, the recognition stage fails most of the time. Indeed, most OCRs are not designed to handle text strings in perspective but rather expect horizontal texts in a parallel-frontal plane to provide a correct transcription. All these aspects, together with the proposition of a very challenging dataset, motivated us to propose a rectification procedure capable of correcting highly distorted texts.


A tree of shapes for multivariate images

Abstract

Nowadays, the demand for multi-scale and region-based analysis in many computer vision and pattern recognition applications is obvious. No one would consider a pixel-based approach as a good candidate to solve such problems. To meet this need, the Mathematical Morphology (MM) framework has supplied region-based hierarchical representations of images such as the Tree of Shapes (ToS). The ToS represents the image in terms of a tree of the inclusion of its level-lines. The ToS is thus self-dual and contrast-change invariant, which makes it well-adapted for high-level image processing. Yet, it is only defined on grayscale images and most attempts to extend it to multivariate images (e.g. by imposing an “arbitrary” total ordering) are not satisfactory. In this dissertation, we present the Multivariate Tree of Shapes (MToS) as a novel approach to extend the grayscale ToS to multivariate images. This representation is a mix of the ToSs computed marginally on each channel of the image; it aims at merging the marginal shapes in a “sensible” way by preserving the maximum number of inclusions. The proposed method has theoretical foundations expressing the ToS in terms of a topographic map of the curvilinear total variation computed from the image border, which has allowed its extension to multivariate data. In addition, the MToS features properties similar to those of the grayscale ToS, the most important one being its invariance to any marginal change of contrast and any marginal inversion of contrast (a kind of “self-duality” in the multidimensional case). As the need for efficient image processing techniques is obvious given the ever larger amount of data to process, we propose an efficient algorithm that can build the MToS in quasi-linear time w.r.t. the number of pixels and quadratic time w.r.t. the number of channels. We also propose tree-based processing algorithms to demonstrate in practice that the MToS is a versatile, easy-to-use, and efficient structure.
Finally, to validate the soundness of our approach, we propose some experiments testing the robustness of the structure to non-relevant components (e.g. with noise or with low dynamics) and we show that such defects do not affect the overall structure of the MToS. In addition, we propose many real-case applications using the MToS. Many of them are just slight modifications of methods employing the “regular” ToS, adapted to our new structure. For example, we successfully use the MToS for image filtering, image simplification, image segmentation, image classification and object detection. From these applications, we show that the MToS generally outperforms its ToS-based counterpart, demonstrating the potential of our approach.
