Jonathan Fabrizio

TextCatcher: A method to detect curved and challenging text in natural scenes

By Jonathan Fabrizio, Myriam Robert-Seidowsky, Séverine Dubuisson, Stefania Calarasanu, Raphaël Boissel

2016-04-08

In International Journal on Document Analysis and Recognition

Abstract

In this paper, we propose a hybrid, multi-scale text detection algorithm. First, it relies on a connected component-based approach: after the image is segmented, a classification step using a new wavelet descriptor spots the letters. A new graph model and its traversal procedure then form candidate text areas. Second, a texture-based approach discards the false positives. Finally, the detected text areas are precisely cut out and a new binarization step is introduced. The main advantage of our method is that it makes few assumptions. Thus, “challenging texts” such as multi-sized, multi-colored, multi-oriented or curved text can be localized. The efficiency of TextCatcher has been validated on three different datasets: two come from the ICDAR competition, and the third contains photographs we took of various everyday texts. We present both qualitative and quantitative results.

Towards the rectification of highly distorted texts

By Stefania Calarasanu, Séverine Dubuisson, Jonathan Fabrizio

2016-02-01

In Proceedings of the 11th international conference on computer vision theory and applications (VISAPP)

Abstract

A frequent challenge for many Text Understanding Systems is to tackle the variety of text characteristics in born-digital and natural scene images to which current OCRs are not well adapted. For example, texts in perspective are frequently present in real-world images, but although some detectors can accurately localize such text objects, the recognition stage fails most of the time. Indeed, most OCRs are not designed to handle text strings in perspective; to provide a correct transcription, they expect horizontal text in a fronto-parallel plane. In this paper, we propose a rectification procedure that can correct highly distorted texts, subject to rotation, shearing and perspective deformations. The method is based on an accurate estimation of the quadrangle bounding the deformed text, from which a homography is computed to transform this quadrangle (and its content) into a horizontal rectangle. The rectification is validated on the dataset proposed during the ICDAR 2015 Competition on Scene Text Rectification.
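
The last step described in this abstract, mapping a bounding quadrangle onto a horizontal rectangle with a homography, can be sketched as follows. This is only an illustration of the standard four-point homography computation, not the authors' implementation: the quadrangle estimation itself is not covered, and the function names (`solve_linear`, `homography`, `apply_homography`) and the sample coordinates are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination
    with partial pivoting (pure Python, no external library)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 3x3 matrix H (with H[2][2] fixed to 1) that maps the
    four src points onto the four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), same for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map one (x, y) point through H using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical quadrangle around a distorted text line (clockwise from
# top-left) and the horizontal rectangle it should be mapped to.
quad = [(12.0, 20.0), (210.0, 35.0), (200.0, 95.0), (5.0, 70.0)]
rect = [(0.0, 0.0), (200.0, 0.0), (200.0, 50.0), (0.0, 50.0)]
H = homography(quad, rect)
corners = [apply_homography(H, p) for p in quad]
```

In a full rectification, every pixel of the output rectangle would be filled by sampling the input image through the inverse mapping; the sketch only checks that the four corners land where expected.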

What is a good evaluation protocol for text localization systems? Concerns, arguments, comparisons and solutions

Abstract

A trustworthy protocol is essential to evaluate a text detection algorithm: first, to measure its efficiency and adjust its parameters and, second, to compare its performance with that of other algorithms. However, current protocols do not give sufficiently precise evaluations because they use coarse evaluation metrics and rely on inconsistent matchings between the output of detection algorithms and the ground truth, both of which are often limited to rectangular shapes. In this paper, we propose a new evaluation protocol, named EvaLTex, that solves some of the current problems associated with classical metrics and matching strategies. Our system deals with different kinds of annotations and detection shapes. It also considers different levels of granularity between detections and ground truth objects and hence provides more realistic and accurate evaluation measures. We use this protocol to evaluate text detection algorithms and highlight some key examples showing that the resulting scores are more relevant than those of currently used evaluation protocols.

Using histogram representation and earth mover’s distance as an evaluation tool for text detection

By Stefania Calarasanu, Jonathan Fabrizio, Séverine Dubuisson

2015-08-01

In Proceedings of the 13th IAPR international conference on document analysis and recognition (ICDAR)

Abstract

In the context of text detection evaluation, it is essential to use protocols that are capable of describing both the quality and the quantity aspects of detection results. In this paper, we propose a novel visual representation and evaluation tool that captures the overall behavior of a detector by using histograms. First, two histograms (coverage and accuracy) are generated to visualize the different characteristics of a detector. Second, we compare these two histograms to a so-called optimal one to compute representative and comparable scores. To do so, we introduce the Earth Mover’s Distance as a reliable evaluation tool to estimate recall and precision scores. Results obtained on the ICDAR 2013 dataset show that this method intuitively characterizes the accuracy of a text detector and gives, at a glance, various useful characteristics of the analyzed algorithm.
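
The comparison of a histogram with an optimal one through the Earth Mover’s Distance can be sketched for the one-dimensional case as follows. This is a minimal illustration, not the paper's protocol: the bin layout, the choice of an optimal histogram with all mass in the last bin, the normalization of the distance into a score, and the names `emd_1d` and `score` are all assumptions made for the example.

```python
def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two 1-D histograms defined on the
    same bins. Both are normalized to unit mass and the ground distance
    between two bins is the difference of their indices."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive mass")
    carried = 0.0  # mass that still has to be moved to the next bin
    work = 0.0     # accumulated mass times distance
    for a, b in zip(hist_a, hist_b):
        carried += a / total_a - b / total_b
        work += abs(carried)
    return work


def score(hist, optimal):
    """Assumed normalization: 1 when hist equals the optimal histogram,
    0 when it is as far from it as two histograms on these bins can be."""
    max_distance = len(hist) - 1
    return 1.0 - emd_1d(hist, optimal) / max_distance


# Hypothetical coverage histogram: number of ground truth objects per
# coverage bin, from lowest coverage (first bin) to full coverage (last).
coverage = [0, 1, 1, 2, 6]
optimal = [0, 0, 0, 0, 10]
distance = emd_1d(coverage, optimal)
coverage_score = score(coverage, optimal)
```

For one-dimensional histograms with equal total mass, the distance reduces to the sum of absolute differences between the cumulative histograms, which is what the loop computes.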

A self-adaptive likelihood function for tracking with particle filter

By Séverine Dubuisson, Myriam Robert-Seidowsky, Jonathan Fabrizio

2015-03-01

In Proceedings of the 10th international conference on computer vision theory and applications (VISAPP)

Abstract

The particle filter is known to be efficient for visual tracking. However, its parameters are fixed empirically, depending on the target application, the video sequences and the context. In this paper, we introduce a new algorithm that automatically adjusts two major ones “on-line”: the correction and the propagation parameters. Our purpose is to determine, for each frame of a video, the optimal value of the correction parameter and to adjust the propagation parameter to improve the tracking performance. On the one hand, our experimental results show that the common settings of the particle filter are sub-optimal. On the other hand, we show that our approach achieves a lower tracking error without requiring these parameters to be tuned. Our adaptive method makes it possible to track objects in complex conditions (illumination changes, cluttered background, etc.) without adding any computational cost compared to the common usage with fixed parameters.
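
To see why the correction parameter matters, consider the correction step of a particle filter with a likelihood of the commonly used form exp(-alpha * d^2), where d is the distance between a particle's appearance and the target model. The sketch below is an assumption-laden illustration: the abstract does not give the likelihood or the adaptation rule, so the exponential form, the parameter name `alpha`, the sample distances and the helper names are hypothetical, and no adaptation is performed.

```python
import math


def correction_weights(distances, alpha):
    """Correction step: turn particle-to-model distances into normalized
    weights with the assumed likelihood exp(-alpha * d^2)."""
    raw = [math.exp(-alpha * d * d) for d in distances]
    total = sum(raw)
    return [w / total for w in raw]


def effective_sample_size(weights):
    """Number of particles that effectively contribute to the estimate:
    close to len(weights) for uniform weights, close to 1 when a single
    particle carries almost all the mass."""
    return 1.0 / sum(w * w for w in weights)


# Hypothetical distances of five particles to the target model.
distances = [0.1, 0.3, 0.5, 0.9, 1.4]

flat = correction_weights(distances, alpha=0.01)    # barely discriminates
peaked = correction_weights(distances, alpha=50.0)  # one particle dominates

ess_flat = effective_sample_size(flat)
ess_peaked = effective_sample_size(peaked)
```

With a very small `alpha` all particles receive nearly the same weight and the measurement is almost ignored; with a very large one the filter collapses onto a single particle. A fixed value has to compromise between these regimes, which is the motivation for adjusting the parameter per frame.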

TextTrail: A robust text tracking algorithm in wild environments

By Myriam Robert-Seidowsky, Jonathan Fabrizio, Séverine Dubuisson

2015-03-01

In Proceedings of the 10th international conference on computer vision theory and applications (VISAPP)

Abstract

In this paper, we propose TextTrail, a robust new algorithm dedicated to text tracking in uncontrolled environments (strong motion of camera and objects, partial occlusions, blur, etc.). It is based on a particle filter framework whose correction step has been improved. First, we compare several likelihood functions and introduce a new one that integrates tangent distance. We show that the likelihood function has a strong influence on text tracking performance. Second, we compare our tracker with another one and, finally, present an example application. TextTrail has been tested on real video sequences and has proven its efficiency. In particular, it can track texts in complex situations starting from a single detection step, without needing another one to reinitialize the tracking model.

A precise skew estimation algorithm for document images using KNN clustering and Fourier transform

By Jonathan Fabrizio

2014-05-26

In Proceedings of the 21st international conference on image processing (ICIP)

Abstract

In this article, we propose a simple and precise skew estimation algorithm for binarized document images. The estimation is performed in the frequency domain. To get a precise result, the Fourier transform is not applied to the document itself but to a preprocessed version of it: all regions of the document are clustered using KNN, and the contours of grouped regions are smoothed using the convex hull to form more regular shapes with a clearer orientation. No assumption is made concerning the nature or the content of the document. This method has been shown to be very accurate and was ranked first at the DISEC’13 contest, during the ICDAR competitions.

Text detection in street level image

By Jonathan Fabrizio, Beatriz Marcotegui, Matthieu Cord

2013-11-05

In Pattern Analysis and Applications

Abstract

Text detection in natural images is a very challenging task in Computer Vision. Image acquisition introduces distortion in terms of perspective, blurring and illumination, and characters may have very different shapes, sizes, and colors. In this article, we introduce a full text detection scheme. Our architecture is based on a new process that combines a hypothesis generation step, which produces potential text boxes, and a hypothesis validation step, which filters out false detections. The hypothesis generation process relies on a new efficient segmentation method based on a morphological operator. Regions are then filtered and classified using shape descriptors based on Fourier moments, Pseudo Zernike moments and an original polar descriptor, which is invariant to rotation. The classification process relies on three SVM classifiers combined in a late fusion scheme. Detected characters are finally grouped to generate our text box hypotheses. The validation step is based on a global SVM classification of the box content using dedicated descriptors adapted from the HOG approach. Results on the well-known ICDAR database are reported, showing that our method is competitive. The evaluation protocol and metrics are discussed in depth, and results on a very challenging street-level database are also presented.

Motion compensation based on tangent distance prediction for video compression

By Jonathan Fabrizio, Séverine Dubuisson, Dominique Béréziat

2012-02-09

In Signal Processing: Image Communication

Abstract

We present a new algorithm for motion compensation that uses a motion estimation method based on tangent distance. The method is compared with a Block-Matching based approach in various common situations. Whereas Block-Matching algorithms usually only predict the positions of blocks over time, our method also predicts the evolution of pixels within these blocks. The prediction error is thus drastically decreased. The method is implemented in the Theora codec, showing that this algorithm improves the codec's performance.
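
For context, the Block-Matching baseline mentioned in this abstract can be sketched as an exhaustive search that, for a block of the current frame, looks for the best-matching block in the previous frame. This is a generic illustration of block matching with a sum of absolute differences cost, not the tangent distance method of the paper and not the Theora implementation; the frame contents, block size, search radius and function names are hypothetical.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between the block of `prev` whose
    top-left corner is (px, py) and the block of `cur` at (cx, cy)."""
    return sum(abs(prev[py + j][px + i] - cur[cy + j][cx + i])
               for j in range(size) for i in range(size))


def block_match(prev, cur, cx, cy, size, radius):
    """Exhaustive search: return (dx, dy, cost) such that the block of
    `prev` at (cx + dx, cy + dy) best predicts the block of `cur` at
    (cx, cy), with |dx| and |dy| bounded by `radius`."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            px, py = cx + dx, cy + dy
            # Skip candidates that fall outside the previous frame.
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue
            cost = sad(prev, cur, px, py, cx, cy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Hypothetical 10x10 frames: the current frame is the previous one
# shifted by 2 pixels to the right and 1 pixel down.
SIZE = 10
prev_frame = [[x * x + 17 * y for x in range(SIZE)] for y in range(SIZE)]
cur_frame = [[prev_frame[y - 1][x - 2] if x >= 2 and y >= 1 else 0
              for x in range(SIZE)] for y in range(SIZE)]

dx, dy, cost = block_match(prev_frame, cur_frame, cx=4, cy=3, size=3, radius=3)
```

The search only returns a displacement for the block as a whole, so a pure translation is predicted exactly while any change of the pixels inside the block ends up in the residual. This is the limitation that the abstract contrasts with its tangent distance based prediction.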

SnooperText: A multiresolution system for text detection in complex visual scenes

By Rodrigo Minetto, Nicolas Thome, Matthieu Cord, Jonathan Fabrizio, Beatriz Marcotegui

2010-12-31

In Proceedings of the IEEE international conference on image processing (ICIP)

Abstract

Text detection in natural images remains a very challenging task. For instance, in an urban context, detection is very difficult because of large variations in shape, size, color and orientation, and because the image may be blurred or unevenly illuminated. In this paper, we describe a robust and accurate multiresolution approach to detect and classify text regions in such scenarios. Based on a generation/validation paradigm, we first segment images to detect character regions with a multiresolution algorithm able to manage large character size variations. The segmented regions are then filtered using shape-based classification, and neighboring characters are merged to generate text hypotheses. A validation step computes a region signature based on texture analysis to reject false positives. We evaluate our algorithm on two challenging databases, achieving very good results.
