Stefania Calarasanu

From text detection to text segmentation: A unified evaluation scheme

By Stefania Calarasanu, Jonathan Fabrizio, Séverine Dubuisson

2016-10-01

In Proceedings of the 2nd international workshop on robust reading conference (IWRR-ECCV)

Abstract

Current text segmentation evaluation protocols are often incapable of properly handling different scenarios (broken/merged/partial characters). This leads to scores that incorrectly reflect the segmentation accuracy. In this article, we propose a new evaluation scheme that overcomes most of the existing drawbacks by extending the EvaLTex protocol (initially designed to evaluate text detection at region level). This new unified platform has numerous advantages: it is able to evaluate a text understanding system at every detection stage and granularity level (paragraph/line/word and now character) by using the same metrics and matching rules; it is robust to all segmentation scenarios; it provides a qualitative and quantitative evaluation and a visual score representation that captures the whole behavior of a segmentation algorithm. Experimental results on nine segmentation algorithms using different evaluation frameworks are also provided to emphasize the interest of our method.


TextCatcher: A method to detect curved and challenging text in natural scenes

By Jonathan Fabrizio, Myriam Robert-Seidowsky, Séverine Dubuisson, Stefania Calarasanu, Raphaël Boissel

2016-04-08

In International Journal on Document Analysis and Recognition

Abstract

In this paper, we propose a text detection algorithm which is hybrid and multi-scale. First, it relies on a connected component-based approach: after the segmentation of the image, a classification step using a new wavelet descriptor spots the letters. A new graph model and its traversal procedure make it possible to form candidate text areas. Second, a texture-based approach discards the false positives. Finally, the detected text areas are precisely cut out and a new binarization step is introduced. The main advantage of our method is that few assumptions are put forward. Thus, “challenging texts” like multi-sized, multi-colored, multi-oriented or curved text can be localized. The efficiency of TextCatcher has been validated on three different datasets: two come from the ICDAR competition, and the third one contains photographs we have taken with various daily life texts. We present both qualitative and quantitative results.


Towards the rectification of highly distorted texts

By Stefania Calarasanu, Séverine Dubuisson, Jonathan Fabrizio

2016-02-01

In Proceedings of the 11th international conference on computer vision theory and applications (VISAPP)

Abstract

A frequent challenge for many Text Understanding Systems is to tackle the variety of text characteristics in born-digital and natural scene images to which current OCRs are not well adapted. For example, texts in perspective are frequently present in real-world images, but despite the ability of some detectors to accurately localize such text objects, the recognition stage fails most of the time. Indeed, most OCRs are not designed to handle text strings in perspective but rather expect horizontal texts in a parallel-frontal plane to provide a correct transcription. In this paper, we propose a rectification procedure that can correct highly distorted texts, subject to rotation, shearing and perspective deformations. The method is based on an accurate estimation of the quadrangle bounding the deformed text in order to compute a homography to transform this quadrangle (and its content) into a horizontal rectangle. The rectification is validated on the dataset proposed during the ICDAR 2015 Competition on Scene Text Rectification.
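The homography step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the four corners of the bounding quadrangle are already known and uses the standard direct linear transform (DLT) solved with NumPy; it does not reproduce the paper's quadrangle estimation, and the function names `homography_from_quad` and `apply_homography` are hypothetical.

```python
import numpy as np


def homography_from_quad(quad, width, height):
    """Estimate the 3x3 homography mapping a quadrangle onto a horizontal rectangle.

    quad: four (x, y) corners ordered top-left, top-right, bottom-right,
    bottom-left (an assumed convention, not taken from the paper).
    """
    src = np.asarray(quad, dtype=float)
    dst = np.array(
        [[0, 0], [width, 0], [width, height], [0, height]], dtype=float
    )
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each point correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    a = np.array(rows)
    # The solution is the null vector of A: the last right-singular vector.
    _, _, vt = np.linalg.svd(a)
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def apply_homography(h, points):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


# Illustrative corners of a distorted text region (made-up values).
quad = [(10, 20), (110, 10), (120, 60), (5, 70)]
H = homography_from_quad(quad, width=200, height=50)
rectified_corners = apply_homography(H, quad)
```

With four correspondences the mapping is exact, so `rectified_corners` coincides with the corners of the 200 x 50 target rectangle up to numerical precision. Warping the pixel content would additionally require resampling the image through the inverse of `H`.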


What is a good evaluation protocol for text localization systems? Concerns, arguments, comparisons and solutions

Abstract

A trustworthy protocol is essential to evaluate a text detection algorithm, first to measure its efficiency and adjust its parameters, and second to compare its performance with that of other algorithms. However, current protocols do not give precise enough evaluations because they use coarse evaluation metrics and rely on inconsistent matchings between the output of detection algorithms and the ground truth, both often limited to rectangular shapes. In this paper, we propose a new evaluation protocol, named EvaLTex, that solves some of the current problems associated with classical metrics and matching strategies. Our system deals with different kinds of annotations and detection shapes. It also considers different kinds of granularity between detections and ground truth objects and hence provides more realistic and accurate evaluation measures. We use this protocol to evaluate text detection algorithms and highlight some key examples that show that the provided scores are more relevant than those of currently used evaluation protocols.


Improvement of a text detection chain and the proposition of a new evaluation protocol for text detection algorithms

Abstract

The objective of this thesis is twofold. On the one hand, it proposes a more accurate evaluation protocol designed for text detection systems that solves some of the existing problems in this area. On the other hand, it focuses on the design of a text rectification procedure used for the correction of highly deformed texts. Text detection systems have gained significant importance in recent years. The growing number of approaches proposed in the literature requires a rigorous performance evaluation and ranking. In the context of text detection, an evaluation protocol relies on three elements: a reliable text reference, a set of matching rules deciding the relationship between the ground truth and the detections, and finally a set of metrics that produce intuitive scores. The few existing evaluation protocols often lack accuracy, either due to inconsistent matching procedures that provide unfair scores or due to unrepresentative metrics. Despite these issues, researchers still use these protocols to evaluate their work. In this Ph.D. thesis, we propose a new evaluation protocol for text detection algorithms that tackles most of the drawbacks faced by currently used evaluation methods. This work is focused on three main contributions: firstly, we introduce a complex text reference representation that does not constrain text detectors to adopt a specific detection granularity level or annotation representation; secondly, we propose a set of matching rules capable of evaluating any type of scenario that can occur between a text reference and a detection; and finally, we show how we can analyze a set of detection results, not only through a set of metrics, but also through an intuitive visual representation. We use this protocol to evaluate different text detectors and then compare the results with those provided by alternative evaluation methods.
A frequent challenge for many Text Understanding Systems is to tackle the variety of text characteristics in born-digital and natural scene images to which current OCRs are not well adapted. For example, texts in perspective are frequently present in real-world images because the camera capture angle is not normal to the plane containing text regions. Despite the ability of some detectors to accurately localize such text objects, the recognition stage fails most of the time. Indeed, most OCRs are not designed to handle text strings in perspective but rather expect horizontal texts in a parallel-frontal plane to provide a correct transcription. All these aspects, together with the proposition of a very challenging dataset, motivated us to propose a rectification procedure capable of correcting highly distorted texts.


Using histogram representation and earth mover’s distance as an evaluation tool for text detection

By Stefania Calarasanu, Jonathan Fabrizio, Séverine Dubuisson

2015-08-01

In Proceedings of the 13th IAPR international conference on document analysis and recognition (ICDAR)

Abstract

In the context of text detection evaluation, it is essential to use protocols that are capable of describing both the qualitative and the quantitative aspects of detection results. In this paper, we propose a novel visual representation and evaluation tool that captures the whole nature of a detector by using histograms. First, two histograms (coverage and accuracy) are generated to visualize the different characteristics of a detector. Second, we compare these two histograms to a so-called optimal one to compute representative and comparable scores. To do so, we introduce the usage of the Earth Mover’s Distance as a reliable evaluation tool to estimate recall and precision scores. Results obtained on the ICDAR 2013 dataset show that this method intuitively characterizes the accuracy of a text detector and gives at a glance various useful characteristics of the analyzed algorithm.
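As a rough illustration of comparing a histogram to an optimal one with the Earth Mover's Distance, the sketch below uses the closed form of the EMD for one-dimensional histograms with unit distance between adjacent bins. Two points are assumptions made for the example and are not taken from the paper: the optimal histogram is taken to have all its mass in the last (best) bin, and the score is normalized as one minus the distance divided by its maximum. The names `emd_1d` and `score_against_optimal` are hypothetical.

```python
import numpy as np


def emd_1d(p, q):
    """Earth Mover's Distance between two 1D histograms with equal bin spacing.

    Both histograms are normalized to unit mass; in one dimension the EMD
    reduces to the sum of absolute differences of the cumulative sums.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.abs(np.cumsum(p - q)).sum())


def score_against_optimal(hist):
    """Score in [0, 1]: 1 when all mass lies in the last bin, 0 when in the first.

    Assumption for illustration: the optimal histogram concentrates all
    mass in the last bin. Requires at least two bins.
    """
    hist = np.asarray(hist, dtype=float)
    optimal = np.zeros_like(hist)
    optimal[-1] = 1.0
    max_distance = len(hist) - 1  # distance when all mass sits in the first bin
    return 1.0 - emd_1d(hist, optimal) / max_distance


# Made-up coverage histogram over four quality bins, from worst to best.
coverage_hist = [1, 2, 3, 14]
coverage_score = score_against_optimal(coverage_hist)
```

Under these assumptions, a detector whose objects all fall in the best bin scores 1, and mass that sits farther from the best bin lowers the score in proportion to the distance it would have to be moved.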
