Didier Verna

A large scale format compliance checker for TeX font metrics

By Didier Verna

2024-09-01

In TUGboat

Abstract

We present tfm-validate, a TeX Font Metrics format checker. The library’s core functionality is to inspect TFM files and report any discovered compliance issue. It can be run on individual files or complete directory trees. tfm-validate also provides a convenience function to (in)validate a local TeXLive installation. When run this way, the library processes every TFM file in the distribution and generates a website aggregating all the discovered non-compliance issues. One public instance of tfm-validate is now automatically triggered on a daily basis. The corresponding website is available at https://texlive.info/tfm-validate/.
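To give a feel for what a format compliance check involves, here is a minimal sketch in Python. It is not tfm-validate's code or API; the function name `check_tfm_preamble` and the wording of the messages are hypothetical. It only tests the consistency constraints that the TFM format places on the twelve 16-bit lengths at the start of a file (`lf`, `lh`, `bc`, `ec`, `nw`, `nh`, `nd`, `ni`, `nl`, `nk`, `ne`, `np`), which is a small part of what a full checker would cover.

```python
import struct

# Names of the twelve 16-bit big-endian integers that open every TFM file.
PREAMBLE_FIELDS = ("lf", "lh", "bc", "ec", "nw", "nh",
                   "nd", "ni", "nl", "nk", "ne", "np")


def check_tfm_preamble(data: bytes) -> list[str]:
    """Return human-readable issues found in a TFM preamble (hypothetical helper)."""
    if len(data) < 24:
        return ["file is too short to contain the 24-byte preamble"]

    v = dict(zip(PREAMBLE_FIELDS, struct.unpack(">12H", data[:24])))
    issues = []

    # Each length must be non-negative and less than 2^15.
    for name, value in v.items():
        if value >= 0x8000:
            issues.append(f"{name} = {value} is not below 2^15")

    # lf is the file length in 4-byte words.
    if v["lf"] * 4 != len(data):
        issues.append(f"lf says {v['lf'] * 4} bytes, file has {len(data)}")

    if v["lh"] < 2:
        issues.append(f"header length lh = {v['lh']} is below 2")

    if not (v["bc"] - 1 <= v["ec"] <= 255):
        issues.append(f"character range bc = {v['bc']}, ec = {v['ec']} is invalid")

    if v["ne"] > 256:
        issues.append(f"ne = {v['ne']} exceeds 256")

    # The declared total must match the sum of the declared table sizes.
    expected = (6 + v["lh"] + (v["ec"] - v["bc"] + 1) + v["nw"] + v["nh"]
                + v["nd"] + v["ni"] + v["nl"] + v["nk"] + v["ne"] + v["np"])
    if v["lf"] != expected:
        issues.append(f"lf = {v['lf']} but the table sizes add up to {expected}")

    return issues
```

Running such a function over every `.tfm` file in a directory tree and collecting the non-empty results would be one plausible way to build an aggregated report, though the actual library may proceed quite differently.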

Similarity problems in paragraph justification: An extension to the Knuth-Plass algorithm

By Didier Verna

2024-08-01

In Proceedings of the ACM symposium on document engineering 2024

Abstract

In high quality typography, consecutive lines beginning or ending with the same word or sequence of characters are considered a defect. We have implemented an extension to TeX’s paragraph justification algorithm which handles this problem. Experimentation shows that the problem is worth addressing and that getting rid of similarities is achievable. Our extension automates the detection and avoidance of similarities while leaving the ultimate decision to the professional typographer, thanks to a new adjustable cursor. The extension is simple and lightweight, making it a useful addition to production engines.
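The notion of similarity described above can be illustrated with a short Python sketch. This is not the extension presented in the paper, which works inside the Knuth-Plass algorithm; it is only a check applied after the fact to lines that have already been broken. The function names and the `threshold` parameter are hypothetical, and the threshold is merely a loose stand-in for the idea of an adjustable setting.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def find_similarities(lines, threshold=2):
    """Report consecutive lines sharing at least `threshold` characters
    at their beginning or at their end.

    Returns a list of (index of the first line, "start" or "end", length).
    """
    found = []
    for i in range(len(lines) - 1):
        a, b = lines[i], lines[i + 1]
        start = common_prefix_len(a, b)
        # Comparing the reversed strings measures the shared ending.
        end = common_prefix_len(a[::-1], b[::-1])
        if start >= threshold:
            found.append((i, "start", start))
        if end >= threshold:
            found.append((i, "end", end))
    return found
```

For example, the lines `"the cat sat on"` and `"the dog ran on"` share four characters at the start (`"the "`) and three at the end (`" on"`), so both would be reported with the default threshold.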

The Quickref cohort

By Didier Verna

2024-05-01

In ELS 2024, the 17th European Lisp Symposium

Abstract

The internal architecture of Declt, our reference manual generator for Common Lisp libraries, is currently evolving towards a three-stage pipeline in which the information gathered for documentation purposes is first reified into a formalized set of object-oriented data structures. A side effect of this evolution is the ability to dump that information for purposes other than documentation. We demonstrate this ability applied to the complete Quicklisp ecosystem. The resulting “cohort” includes more than half a million programmatic definitions, and can be used to gain insight into the morphology of Common Lisp software.

Interactive and real-time typesetting for demonstration and experimentation: ETAP

By Didier Verna

2023-09-01

In TUGboat

Abstract

In general, typesetting experimentation is not a very practical thing to do. WYSIWYG typesetting systems are very reactive but do not offer highly configurable algorithms, and TeX, with its separate development / compilation / visualization phases, is not as interactive as its WYSIWYG competitors. Being able to experiment with typesetting algorithms interactively and in real-time is nevertheless desirable, for instance for demonstration purposes, or for rapid prototyping and debugging of new ideas. We present ETAP (Experimental Typesetting Algorithms Platform), a tool written to ease typesetting experimentation and demonstration. ETAP currently provides several paragraph justification algorithms, all with many configuration options such as kerning, ligatures, flexible spaces, sloppiness, hyphenation, etc. The resulting paragraph is displayed with many visual hints as well, such as paragraph, character, and line boxes, baselines, over/underfullness hints, hyphenation clues, etc. All these parameters, along with the desired paragraph width, are adjustable interactively through a GUI, and the resulting paragraph is displayed and updated in real-time. But ETAP can also be used without, or in conjunction with, the GUI, as a scriptable application. In particular, it is able to generate all sorts of statistical reports or charts on the behavior of the various algorithms, for instance, the number of over/underfull boxes per paragraph width, the average compression or stretch ratio per line, or whatever else you want. This allows you to quickly demonstrate or evaluate the comparative behavior or merits of the provided algorithms, or whichever you may want to add to the pool.

Structural analysis of the additive noise impact on the $\alpha$-tree

By Baptiste Esteban, Guillaume Tochon, Edwin Carlinet, Didier Verna

2023-06-30

In Proceedings of the 20th international conference on computer analysis of images and patterns (CAIP)

Abstract

Hierarchical representations are very convenient tools when working with images. Among them, the $\alpha$-tree is the basis of several powerful hierarchies used for various applications such as image simplification, object detection, or segmentation. However, it has been demonstrated that these tasks are very sensitive to the noise corrupting the image. While the quality of some $\alpha$-tree applications has been studied, including some with noisy images, the noise impact on the whole structure has been little investigated. Thus, in this paper, we examine the structure of $\alpha$-trees built on images corrupted by some noise with respect to the noise level. We compare its effects on constant and natural images, with different kinds of content, and we demonstrate the relation between the noise level and the distribution of every $\alpha$-tree node depth. Furthermore, we extend this study to the node persistence under a given energy criterion, and we propose a novel energy definition that allows assessing the robustness of a region to the noise.

A MOP-based implementation for method combinations

By Didier Verna

2023-04-01

In ELS 2023, the 16th European Lisp Symposium

Abstract

In traditional object-oriented languages, the dynamic dispatch algorithm is hardwired to select and execute the most specific method in a polymorphic call. In CLOS, the Common Lisp Object System, an abstraction known as “method combinations” allows the programmer to define their own dispatch scheme. When Common Lisp was standardized, method combinations were not mature enough to be fully specified. In 2018, using SBCL as a research vehicle, we analyzed the unfortunate consequences of this under-specification and proposed a layer on top of method combinations designed to both correct a number of observed behavioral inconsistencies, and propose an extension called “alternative combinators”. Following this work, SBCL underwent a number of internal changes that fixed the reported inconsistencies, although in a way that hindered further experimentation. In this paper, we analyze SBCL’s new method combinations implementation and we propose an alternative design. Our solution is standard-compliant so any Lisp implementation can potentially use it. It is also based on the MOP, meaning that it is extensible, which restores the opportunity for further experimentation. In particular, we revisit our former “alternative combinators” extension, broken after 2018, and demonstrate that provided with this new infrastructure, it can be re-implemented in a much simpler and non-intrusive way.

The cost of dynamism in static languages for image processing

By Baptiste Esteban, Edwin Carlinet, Guillaume Tochon, Didier Verna

2022-10-10

In Proceedings of the 21st international conference on generative programming: Concepts & experiences (GPCE 2022)

Abstract

Generic programming is a powerful paradigm abstracting data structures and algorithms to improve their reusability, as long as they respect a given interface. Coupled with a performance-driven language, it is a paradigm of choice for scientific libraries where the implementation of manipulated objects may change depending on their use case, or for performance purposes. In those performance-driven languages, genericity is often implemented statically to perform some optimization. This does not fit well with the dynamism needed to handle objects which may only be known at runtime. Thus, in this article, we evaluate a model that couples static genericity with a dynamic model based on type erasure in the context of image processing. Its cost is assessed by comparing the performance of the implementation of some common image processing algorithms in C++ and Rust, two performance-driven languages supporting some form of genericity. Finally, we demonstrate that compile-time knowledge of some specific information is critical for performance, and also that the runtime overhead depends on the algorithmic scheme in use.

Noise level function estimation for color images using mathematical morphology (Estimation de la fonction de niveau de bruit pour des images couleurs en utilisant la morphologie mathématique)

By Baptiste Esteban, Guillaume Tochon, Edwin Carlinet, Didier Verna

2022-06-15

In 28e colloque sur le traitement du signal et des images

Abstract

The noise level is an important piece of information for some image processing applications such as segmentation or denoising. In the past, we proposed a method to estimate this noise level by adapting to the content of a grayscale image, and we showed that its performance exceeds that of the state of the art. In this article, we propose an extension of this method to color images, whose multivariate values, lacking a natural ordering, raise new problems. To solve them, we use two tools from mathematical morphology: the multivariate tree of shapes and complete lattice learning. Finally, we confirm the conclusions of our previous work for the estimation of the color noise level function, showing that adapting to the content of an image gives better performance than using square blocks.

Dynamic genericity for morphological algorithms (Généricité dynamique pour des algorithmes morphologiques)

By Baptiste Esteban, Edwin Carlinet, Guillaume Tochon, Didier Verna

2022-06-15

In 28e colloque sur le traitement du signal et des images

Abstract

Genericity is a powerful paradigm that makes it possible to implement a single algorithm and run it on different data types. It is therefore widely used when developing scientific libraries, notably in image processing, where algorithms may apply to different types of images. C++ is a language of choice for this kind of library. It supports this paradigm, and its applications perform well thanks to its compiled nature. However, unlike dynamic languages such as Python or Julia, its capabilities for interactivity, which are useful when prototyping algorithms, are limited because of its static nature. In this article, we therefore review the different techniques that allow static and dynamic polymorphism to be used together, and we then evaluate the cost of turning static information into information known only at runtime. In particular, we show that some pieces of information about an image matter more than others for performance, and that the overhead also depends on the algorithm used.

Structural analysis of the influence of noise on the alpha-tree (Analyse structurelle de l’influence du bruit sur l’arbre alpha)

By Baptiste Esteban, Guillaume Tochon, Edwin Carlinet, Didier Verna

2022-06-14

In 29e colloque sur le traitement du signal et des images

Abstract

The alpha-tree is a hierarchical representation used in various image processing tasks such as segmentation or simplification. These tasks are nevertheless sensitive to noise, which sometimes requires adapting them. Yet the influence of noise on the structure of the alpha-tree has been little studied in the literature. We therefore propose a study of the impact of noise, as a function of its level, on the structure of the tree. In addition, we extend this study to the persistence of the tree’s nodes with respect to a given energy, and we conclude that some functionals are more sensitive to noise than others.
