Joseph Chazalon

SmartDoc 2017 video capture: Mobile document acquisition in video mode

By Joseph Chazalon, P. Gomez-Krämer, J.-C. Burie, M. Coustaty, S. Eskenazi, M. Luqman, N. Nayef, M. Rusiñol, N. Sidère, J. M. Ogier.

2017-07-21

In Proceedings of the 1st international workshop on open services and tools for document analysis (ICDAR-OST)

Abstract

As mobile document acquisition using smartphones becomes increasingly common, along with the continuous improvement of mobile devices (both in terms of computing power and image quality), one may wonder to what extent mobile phones can replace desktop scanners. Modern applications can cope with perspective distortion and normalize the contrast of a document page captured with a smartphone, and in some cases, such as bottle labels or posters, smartphones even have the advantage of allowing the acquisition of non-flat or large documents. However, several cases remain hard to handle, such as reflective documents (identity cards, badges, glossy magazine covers, etc.) or large documents in which some regions require a high level of detail. This paper introduces the SmartDoc 2017 benchmark (named “SmartDoc Video Capture”), which aims at assessing whether capturing documents using the video mode of a smartphone could solve those issues. The task under evaluation is both a stitching and a reconstruction problem, as the user can move the device over different parts of the document to capture details or to remove highlights. The material released consists of a dataset, an evaluation method and its associated tool, a sample method, and the tools required to extend the dataset. All the components are released publicly under very permissive licenses, and particular care was taken to maximize ease of understanding, usage and improvement.
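To give a concrete, if simplified, picture of the stitching side of the task, the sketch below mosaics frames sampled from a capture video into a single image with OpenCV's generic scan stitcher. It is purely illustrative and is not the benchmark's reference or sample method; the video path and sampling step are hypothetical.

```python
# Illustrative sketch only: mosaic sampled video frames into a single
# document image using OpenCV's generic stitcher (SCANS mode targets
# planar scenes such as a flat page). Not the benchmark's method.
import cv2

def mosaic_from_video(video_path, frame_step=15):
    cap = cv2.VideoCapture(video_path)
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % frame_step == 0:   # keep one frame out of `frame_step`
            frames.append(frame)
        index += 1
    cap.release()

    stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)
    status, mosaic = stitcher.stitch(frames)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status {status}")
    return mosaic

# Example usage (hypothetical file name):
# reconstructed = mosaic_from_video("capture.mp4")
```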


Benchmarking keypoint filtering approaches for document image matching

By E. Royer, Joseph Chazalon, M. Rusiñol, F. Bouchara

2017-07-04

In Proceedings of the 14th international conference on document analysis and recognition (ICDAR)

Abstract

Reducing the number of keypoints used to index an image is particularly interesting for controlling processing time and memory usage in real-time document image matching applications, such as augmented documents or smartphone applications. This paper benchmarks two keypoint selection methods on a task consisting of reducing keypoint sets extracted from document images while preserving detection and segmentation accuracy. We first study the different forms of keypoint filtering, and we introduce the use of the CORE selection method on keypoints extracted from document images. Then, we extend a previously published benchmark by including evaluations of the new method, by adding the SURF-BRISK detection/description scheme, and by reporting processing speeds. Evaluations are conducted on the publicly available dataset of the ICDAR2015 SmartDOC challenge 1. Finally, we show that reducing the original keypoint set is always feasible and can be beneficial not only to processing speed but also to accuracy.
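As an informal illustration of what keypoint filtering means in practice, the sketch below keeps only the strongest detections before matching. It uses a naive response-based filter and ORB features, not the CORE selection method or the SURF-BRISK scheme evaluated in the paper; thresholds and parameters are assumptions.

```python
# Illustrative sketch only: naive response-based keypoint filtering,
# not the CORE selection method benchmarked in the paper.
import cv2

def detect_and_filter(image_path, keep=200):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(nfeatures=2000)       # ORB stands in for SURF-BRISK here
    keypoints, descriptors = orb.detectAndCompute(img, None)
    # Keep only the `keep` strongest keypoints to shrink the index.
    order = sorted(range(len(keypoints)),
                   key=lambda i: keypoints[i].response, reverse=True)[:keep]
    keypoints = [keypoints[i] for i in order]
    descriptors = descriptors[order]
    return keypoints, descriptors

def match(desc_query, desc_model, ratio=0.75):
    # Brute-force Hamming matching with Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    pairs = matcher.knnMatch(desc_query, desc_model, k=2)
    return [p[0] for p in pairs
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
```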


Augmented songbook: An augmented reality educational application for raising music awareness

By Marçal Rusiñol, Joseph Chazalon, Katerine Diaz-Chito

2017-06-29

In Multimedia Tools and Applications

Abstract

This paper presents the development of an Augmented Reality mobile application which aims at raising young children's awareness of abstract musical concepts, such as musical notation or the notion of rhythm. Recent studies in Augmented Reality for education suggest that such technologies have multiple benefits for students, including younger ones. As mobile document image acquisition and processing gain maturity on mobile platforms, we explore how to build a markerless, real-time application that augments physical documents with didactic animations and interactive content. Given a standard image processing pipeline, we compare the performance of different local descriptors at two key stages of the process. Results suggest alternatives to SIFT local descriptors, in terms of both result quality and computational efficiency, for document model identification as well as perspective transform estimation. All experiments are performed on an original, public dataset that we introduce here.
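For readers unfamiliar with the second stage mentioned above, the following sketch shows a generic way of estimating the perspective transform between a document model and a camera frame with RANSAC. The descriptor choice, thresholds and function names are assumptions for illustration, not the application's actual pipeline.

```python
# Illustrative sketch only: robust perspective transform (homography)
# estimation between a document model image and a camera frame.
import cv2
import numpy as np

def estimate_homography(model_img, frame_img):
    detector = cv2.ORB_create(nfeatures=1000)   # placeholder descriptor choice
    kp_m, desc_m = detector.detectAndCompute(model_img, None)
    kp_f, desc_f = detector.detectAndCompute(frame_img, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(desc_m, desc_f)

    src = np.float32([kp_m[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_f[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Robustly fit the planar homography mapping model points to frame points.
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```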
